Showing papers in "BMC Evolutionary Biology in 2007"


Journal ArticleDOI
TL;DR: BEAST is a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree that provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions.
Abstract: The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented. BEAST version 1.4.6 consists of 81000 lines of Java source code, 779 classes and 81 packages. It provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions. BEAST source code is object-oriented, modular in design and freely available at http://beast-mcmc.googlecode.com/ under the GNU LGPL license. BEAST is a powerful and flexible evolutionary analysis package for molecular sequence variation. It also provides a resource for the further development of new models and statistical methods of evolutionary analysis.

11,916 citations


Journal ArticleDOI
TL;DR: The CAT model appears to be more robust than WAG against LBA artefacts, essentially because it correctly anticipates the high probability of convergences and reversions implied by the small effective size of the amino-acid alphabet at each site of the alignment.
Abstract: Thanks to the large amount of signal contained in genome-wide sequence alignments, phylogenomic analyses are converging towards highly supported trees. However, high statistical support does not imply that the tree is accurate. Systematic errors, such as the Long Branch Attraction (LBA) artefact, can be misleading, in particular when the taxon sampling is poor, or the outgroup is distant. In an otherwise consistent probabilistic framework, systematic errors in genome-wide analyses can be traced back to model mis-specification problems, which suggests that better models of sequence evolution should be devised, that would be more robust to tree reconstruction artefacts, even under the most challenging conditions. We focus on a well characterized LBA artefact analyzed in a previous phylogenomic study of the metazoan tree, in which two fast-evolving animal phyla, nematodes and platyhelminths, emerge either at the base of all other Bilateria, or within protostomes, depending on the outgroup. We use this artefactual result as a case study for comparing the robustness of two alternative models: a standard, site-homogeneous model, based on an empirical matrix of amino-acid replacement (WAG), and a site-heterogeneous mixture model (CAT). In parallel, we propose a posterior predictive test, allowing one to measure how well a model acknowledges sequence saturation. Adopting a Bayesian framework, we show that the LBA artefact observed under WAG disappears when the site-heterogeneous model CAT is used. Using cross-validation, we further demonstrate that CAT has a better statistical fit than WAG on this data set. Finally, using our statistical goodness-of-fit test, we show that CAT, but not WAG, correctly accounts for the overall level of saturation, and that this is due to a better estimation of site-specific amino-acid preferences. 
The CAT model appears to be more robust than WAG against LBA artefacts, essentially because it correctly anticipates the high probability of convergences and reversions implied by the small effective size of the amino-acid alphabet at each site of the alignment. More generally, our results provide strong evidence that site-specificities in the substitution process need be accounted for in order to obtain more reliable phylogenetic trees.

591 citations


Journal ArticleDOI
TL;DR: Results from regression analysis indicate that cryptic species are almost evenly distributed among major metazoan taxa and biogeographical regions when corrected for species richness and study intensity, indicating that morphological stasis represents an evolutionary constant and that cryptic metazoan diversity does predictably affect estimates of earth's animal diversity.
Abstract: Cryptic species are two or more distinct but morphologically similar species that were classified as a single species. During the past two decades we observed an exponential growth of publications on cryptic species. Recently published reviews have demonstrated cryptic species have profound consequences on many biological disciplines. It has been proposed that their distribution is non-random across taxa and biomes. We analysed a literature database for the taxonomic and biogeographical distribution of cryptic animal species reports. Results from regression analysis indicate that cryptic species are almost evenly distributed among major metazoan taxa and biogeographical regions when corrected for species richness and study intensity. This indicates that morphological stasis represents an evolutionary constant and that cryptic metazoan diversity does predictably affect estimates of earth's animal diversity. Our findings have direct theoretical and practical consequences for a number of prevailing biological questions with regard to global biodiversity estimates, conservation efforts and global taxonomic initiatives.

534 citations


Journal ArticleDOI
TL;DR: A practical approach that systematically compares whole genome sequences to identify single-copy nuclear gene markers for inferring phylogeny is presented; it improves on traditional approaches because it uses genomic information and automates the process to identify large numbers of candidate markers.
Abstract: Molecular systematics occupies one of the central stages in biology in the genomic era, ushered in by unprecedented progress in DNA technology. The inference of organismal phylogeny is now based on many independent genetic loci, a widely accepted approach to assemble the tree of life. Surprisingly, this approach is hindered by a lack of appropriate nuclear gene markers for many taxonomic groups, especially at high taxonomic levels, partially due to the lack of tools for efficiently developing new phylogenetic markers. We report here a genome-comparison strategy for identifying nuclear gene markers for phylogenetic inference and apply it to the ray-finned fishes – the largest vertebrate clade in need of phylogenetic resolution. A total of 154 candidate molecular markers – relatively well conserved, putatively single-copy gene fragments with long, uninterrupted exons – were obtained by comparing whole genome sequences of two model organisms, Danio rerio and Takifugu rubripes. Experimental tests of 15 of these (randomly picked) markers on 36 taxa (representing two-thirds of the ray-finned fish orders) demonstrate the feasibility of amplifying by PCR and directly sequencing most of these candidates from whole genomic DNA in a vast diversity of fish species. Preliminary phylogenetic analyses of sequence data obtained for 14 taxa and 10 markers (total of 7,872 bp for each species) are encouraging, suggesting that the markers obtained will make significant contributions to future fish phylogenetic studies. We present a practical approach that systematically compares whole genome sequences to identify single-copy nuclear gene markers for inferring phylogeny. Our method is an improvement over traditional approaches (e.g., manually picking genes for testing) because it uses genomic information and automates the process to identify large numbers of candidate markers.
This approach is shown here to be successful for fishes, but also could be applied to other groups of organisms for which two or more complete genome sequences exist, which has important implications for assembling the tree of life.

360 citations


Journal ArticleDOI
TL;DR: The first comprehensive analysis of the evolution and conserved functions of Nox and Duox family members, including identification of conserved amino acid residues is provided, providing a guide for future structure-function studies and for understanding the evolution of biological functions of these enzymes.
Abstract: NADPH-oxidases (Nox) and the related Dual oxidases (Duox) play varied biological and pathological roles via regulated generation of reactive oxygen species (ROS). Members of the Nox/Duox family have been identified in a wide variety of organisms, including mammals, nematodes, fruit fly, green plants, fungi, and slime molds; however, little is known about the molecular evolutionary history of these enzymes. We assembled and analyzed the deduced amino acid sequences of 101 Nox/Duox orthologs from 25 species, including vertebrates, urochordates, echinoderms, insects, nematodes, fungi, slime mold amoeba, alga and plants. In contrast to ROS defense enzymes, such as superoxide dismutase and catalase that are present in prokaryotes, ROS-generating Nox/Duox orthologs only appeared later in evolution. Molecular taxonomy revealed seven distinct subfamilies of Noxes and Duoxes. The calcium-regulated orthologs representing 4 subfamilies diverged early and are the most widely distributed in biology. Subunit-regulated Noxes represent a second major subdivision, and appeared first in fungi and amoeba. Nox5 was lost in rodents, and Nox3, which functions in the inner ear in gravity perception, emerged the most recently, corresponding to full-time adaptation of vertebrates to land. The sea urchin Strongylocentrotus purpuratus possesses the earliest Nox2 co-ortholog of vertebrate Nox1, 2, and 3, while Nox4 first appeared somewhat later in urochordates. Comparison of evolutionary substitution rates demonstrates that Nox2, the regulatory subunits p47phox and p67phox, and Duox are more stringently conserved in vertebrates than other Noxes and Nox regulatory subunits. Amino acid sequence comparisons identified key catalytic or regulatory regions, as 68 residues were highly conserved among all Nox/Duox orthologs, and 14 of these were identical with those mutated in Nox2 in variants of X-linked chronic granulomatous disease. 
In addition to canonical motifs, the B-loop, TM6-FAD, VXGPFG-motif, and extreme C-terminal regions were identified as important for Nox activity, as verified by mutational analysis. The presence of these non-canonical, but highly conserved regions suggests that all Nox/Duox may possess a common biological function that has been retained over the long history of Nox/Duox evolution. This report provides the first comprehensive analysis of the evolution and conserved functions of Nox and Duox family members, including identification of conserved amino acid residues. These results provide a guide for future structure-function studies and for understanding the evolution of biological functions of these enzymes.

302 citations


Journal ArticleDOI
TL;DR: Using multiple genes and explicit hypothesis testing, it is shown that Echiura, Siboglinidae, and Clitellata are derived annelids with polychaete sister taxa, and that Sipuncula should be included within annelids.
Abstract: Annelida comprises an ancient and ecologically important animal phylum with over 16,500 described species and members are the dominant macrofauna of the deep sea. Traditionally, two major groups are distinguished: Clitellata (including earthworms, leeches) and "Polychaeta" (mostly marine worms). Recent analyses of molecular data suggest that Annelida may include other taxa once considered separate phyla (i.e., Echiura, and Sipuncula) and that Clitellata are derived annelids, thus rendering "Polychaeta" paraphyletic; however, this contradicts classification schemes of annelids developed from recent analyses of morphological characters. Given that deep-level evolutionary relationships of Annelida are poorly understood, we have analyzed comprehensive datasets based on nuclear and mitochondrial genes, and have applied rigorous testing of alternative hypotheses so that we can move towards the robust reconstruction of annelid history needed to interpret animal body plan evolution. Sipuncula, Echiura, Siboglinidae, and Clitellata are all nested within polychaete annelids according to phylogenetic analyses of three nuclear genes (18S rRNA, 28S rRNA, EF1α; 4552 nucleotide positions analyzed) for 81 taxa, and 11 nuclear and mitochondrial genes for 10 taxa (additional: 12S rRNA, 16S rRNA, ATP8, COX1-3, CYTB, NAD6; 11,454 nucleotide positions analyzed). For the first time, these findings are substantiated using approximately unbiased tests and non-scaled bootstrap probability tests that compare alternative hypotheses. For echiurans, the polychaete group Capitellidae is corroborated as the sister taxon; while the exact placement of Sipuncula within Annelida is still uncertain, our analyses suggest an affiliation with terebellimorphs. Siboglinids are in a clade with other sabellimorphs, and clitellates fall within a polychaete clade with aeolosomatids as their possible sister group. 
None of our analyses support the major polychaete clades reflected in the current classification scheme of annelids, and hypothesis testing significantly rejects monophyly of Scolecida, Palpata, Canalipalpata, and Aciculata. Using multiple genes and explicit hypothesis testing, we show that Echiura, Siboglinidae, and Clitellata are derived annelids with polychaete sister taxa, and that Sipuncula should be included within annelids. The traditional composition of Annelida greatly underestimates the morphological diversity of this group, and inclusion of Sipuncula and Echiura implies that patterns of segmentation within annelids have been evolutionarily labile. Relationships within Annelida based on our analyses of multiple genes challenge the current classification scheme, and some alternative hypotheses are provided.

296 citations


Journal ArticleDOI
TL;DR: The reconstruction of the phylogenetic relationship of NRPS C domain subtypes is reported and the sequence motifs of recently discovered subtypes (Dual E/C, DCL and Starter domains) and their characteristic sequence differences are analyzed, mutually and in comparison with LCL domains.
Abstract: Non-ribosomal peptide synthetases (NRPSs) are large multimodular enzymes that synthesize a wide range of biologically active natural peptide compounds, of which many are pharmacologically important. Peptide bond formation is catalyzed by the Condensation (C) domain. Various functional subtypes of the C domain exist: An LCL domain catalyzes a peptide bond between two L-amino acids, a DCL domain links an L-amino acid to a growing peptide ending with a D-amino acid, a Starter C domain (first denominated and classified as a separate subtype here) acylates the first amino acid with a β-hydroxy-carboxylic acid (typically a β-hydroxyl fatty acid), and Heterocyclization (Cyc) domains catalyze both peptide bond formation and subsequent cyclization of cysteine, serine or threonine residues. The homologous Epimerization (E) domain flips the chirality of the last amino acid in the growing peptide; Dual E/C domains catalyze both epimerization and condensation. In this paper, we report on the reconstruction of the phylogenetic relationship of NRPS C domain subtypes and analyze in detail the sequence motifs of recently discovered subtypes (Dual E/C, DCL and Starter domains) and their characteristic sequence differences, mutually and in comparison with LCL domains. Based on their phylogeny and the comparison of their sequence motifs, LCL and Starter domains appear to be more closely related to each other than to other subtypes, though pronounced differences in some segments of the protein account for the unequal donor substrates (amino vs. β-hydroxy-carboxylic acid). Furthermore, on the basis of phylogeny and the comparison of sequence motifs, we conclude that Dual E/C and DCL domains share a common ancestor. In the same way, the evolutionary origin of a C domain of unknown function in glycopeptide (GP) NRPSs can be determined to be an LCL domain. In the case of two GP C domains which are most similar to DCL but which have LCL activity, we postulate convergent evolution. 
We systematize all C domain subtypes including the novel Starter C domain. With our results, it will be easier to decide the subtype of unknown C domains as we provide profile Hidden Markov Models (pHMMs) for the sequence motifs as well as for the entire sequences. The determined specificity conferring positions will be helpful for the mutation of one subtype into another, e.g. turning DCL to LCL, which can be a useful step for obtaining novel products.

284 citations


Journal ArticleDOI
TL;DR: This article further investigates the taxonomic relationship of strains described as E. sakazakii; when the species was defined in 1980, 15 biogroups were described and it was suggested that these could represent multiple species.
Abstract: Enterobacter sakazakii is an opportunistic pathogen that can cause infections such as necrotizing enterocolitis, bacteraemia, meningitis and brain abscess/lesions. When the species was defined in 1980, 15 biogroups were described and it was suggested that these could represent multiple species. In this study the taxonomic relationship of strains described as E. sakazakii was further investigated. Strains identified as E. sakazakii were divided into separate groups on the basis of f-AFLP fingerprints, ribopatterns and full-length 16S rRNA gene sequences. DNA-DNA hybridizations revealed five genomospecies. The phenotypic profiles of the genomospecies were determined and biochemical markers identified. This study clarifies the taxonomy of E. sakazakii and proposes a reclassification of these organisms.

276 citations


Journal ArticleDOI
TL;DR: The first evolutionary analysis of a whole superfamily of transcription factors, the basic helix-loop-helix (bHLH) proteins, at the scale of the whole metazoan kingdom is reported, suggesting that these features may be extended to other developmental gene families and reflect a general trend in the evolution of the developmental gene repertoires of metazoans.
Abstract: Molecular and genetic analyses conducted in model organisms such as Drosophila and vertebrates, have provided a wealth of information about how networks of transcription factors control the proper development of these species. Much less is known, however, about the evolutionary origin of these elaborated networks and their large-scale evolution. Here we report the first evolutionary analysis of a whole superfamily of transcription factors, the basic helix-loop-helix (bHLH) proteins, at the scale of the whole metazoan kingdom. We identified in silico the putative full complement of bHLH genes in the sequenced genomes of 12 different species representative of the main metazoan lineages, including three non-bilaterian metazoans, the cnidarians Nematostella vectensis and Hydra magnipapillata and the demosponge Amphimedon queenslandica. We have performed extensive phylogenetic analyses of the 695 identified bHLHs, which has allowed us to allocate most of these bHLHs to defined evolutionary conserved groups of orthology. Three main features in the history of the bHLH gene superfamily can be inferred from these analyses: (i) an initial diversification of the bHLHs has occurred in the pre-Cambrian, prior to metazoan cladogenesis; (ii) a second expansion of the bHLH superfamily occurred early in metazoan evolution before bilaterians and cnidarians diverged; and (iii) the bHLH complement during the evolution of the bilaterians has been remarkably stable. We suggest that these features may be extended to other developmental gene families and reflect a general trend in the evolution of the developmental gene repertoires of metazoans.

273 citations


Journal ArticleDOI
TL;DR: The studies provide an evolutionary history for this important family of proteins and a framework and consistent nomenclature for comparison of septin orthologs across kingdoms.
Abstract: Septins are cytoskeletal GTPase proteins first discovered in the fungus Saccharomyces cerevisiae where they organize the septum and link nuclear division with cell division. More recently septins have been found in animals where they are important in processes ranging from actin and microtubule organization to embryonic patterning and where defects in septins have been implicated in human disease. Previous studies suggested that many animal septins fell into independent evolutionary groups, confounding cross-kingdom comparison. In the current work, we identified 162 septins from fungi, microsporidia and animals and analyzed their phylogenetic relationships. There was support for five groups of septins with orthology between kingdoms. Group 1 (which includes S. cerevisiae Cdc10p and human Sept9) and Group 2 (which includes S. cerevisiae Cdc3p and human Sept7) contain sequences from fungi and animals. Group 3 (which includes S. cerevisiae Cdc11p) and Group 4 (which includes S. cerevisiae Cdc12p) contain sequences from fungi and microsporidia. Group 5 (which includes Aspergillus nidulans AspE) contains sequences from filamentous fungi. We suggest a modified nomenclature based on these phylogenetic relationships. Comparative sequence alignments revealed septin derivatives of already known G1, G3 and G4 GTPase motifs, four new motifs from two to twelve amino acids long and six conserved single amino acid positions. One of these new motifs is septin-specific and several are group specific. Our studies provide an evolutionary history for this important family of proteins and a framework and consistent nomenclature for comparison of septin orthologs across kingdoms.

270 citations


Journal ArticleDOI
TL;DR: Hybridization between species of Heliconius appears to be a natural phenomenon; there is no evidence that it has been enhanced by recent human habitat disturbance, and this finding concurs with the view that processes leading to speciation are continuous, rather than sudden, and that they are the same as those operating within species, rather than requiring special punctuated effects or complete allopatry.
Abstract: To understand speciation and the maintenance of taxa as separate entities, we need information about natural hybridization and gene flow among species. Interspecific hybrids occur regularly in Heliconius and Eueides (Lepidoptera: Nymphalidae) in the wild: 26–29% of the species of Heliconiina are involved, depending on species concept employed. Hybridization is, however, rare on a per-individual basis. For one well-studied case of species hybridizing in parapatric contact (Heliconius erato and H. himera), phenotypically detectable hybrids form around 10% of the population, but for species in sympatry hybrids usually form less than 0.05% of individuals. There is a roughly exponential decline with genetic distance in the numbers of natural hybrids in collections, both between and within species, suggesting a simple "exponential failure law" of compatibility as found in some prokaryotes. Hybridization between species of Heliconius appears to be a natural phenomenon; there is no evidence that it has been enhanced by recent human habitat disturbance. In some well-studied cases, backcrossing occurs in the field and fertile backcrosses have been verified in insectaries, which indicates that introgression is likely, and recent molecular work shows that alleles at some but not all loci are exchanged between pairs of sympatric, hybridizing species. Molecular clock dating suggests that gene exchange may continue for more than 3 million years after speciation. In addition, one species, H. heurippa, appears to have formed as a result of hybrid speciation. Introgression may often contribute to adaptive evolution as well as sometimes to speciation itself, via hybrid speciation. Geographic races and species that coexist in sympatry therefore form part of a continuum in terms of hybridization rates or probability of gene flow. 
This finding concurs with the view that processes leading to speciation are continuous, rather than sudden, and that they are the same as those operating within species, rather than requiring special punctuated effects or complete allopatry. Although not qualitatively distinct from geographic races, nor "real" in terms of phylogenetic species concepts or the biological species concept, hybridizing species of Heliconius are stably distinct in sympatry, and remain useful groups for predicting morphological, ecological, behavioural and genetic characteristics.

Journal ArticleDOI
TL;DR: The finding that the human GH18 gene family is closely linked to the human major histocompatibility complex paralogon on chromosome 1, together with the recent association of GH18 chitinase activity with Th2 cell inflammation, suggests that its late expansion could be related to an emerging interface of innate and adaptive immunity during early vertebrate history.
Abstract: Chitinases (EC 3.2.1.14) hydrolyze the β-1,4-linkages in chitin, an abundant N-acetyl-β-D-glucosamine polysaccharide that is a structural component of protective biological matrices such as insect exoskeletons and fungal cell walls. The glycoside hydrolase 18 (GH18) family of chitinases is an ancient gene family widely expressed in archaea, prokaryotes and eukaryotes. Mammals are not known to synthesize chitin or metabolize it as a nutrient, yet the human genome encodes eight GH18 family members. Some GH18 proteins lack an essential catalytic glutamic acid and are likely to act as lectins rather than as enzymes. This study used comparative genomic analysis to address the evolutionary history of the GH18 multiprotein family, from early eukaryotes to mammals, in an effort to understand the forces that shaped the human genome content of chitinase related proteins. Gene duplication and loss according to a birth-and-death model of evolution is a feature of the evolutionary history of the GH18 family. The current human family likely originated from ancient genes present at the time of the bilaterian expansion (approx. 550 mya). The family expanded in the chitinous protostomes C. elegans and D. melanogaster, declined in early deuterostomes as chitin synthesis disappeared, and expanded again in late deuterostomes with a significant increase in gene number after the avian/mammalian split. This comprehensive genomic study of animal GH18 proteins reveals three major phylogenetic groups in the family: chitobiases, chitinases/chitolectins, and stabilin-1 interacting chitolectins. Only the chitinase/chitolectin group is associated with expansion in late deuterostomes.
Finding that the human GH18 gene family is closely linked to the human major histocompatibility complex paralogon on chromosome 1, together with the recent association of GH18 chitinase activity with Th2 cell inflammation, suggests that its late expansion could be related to an emerging interface of innate and adaptive immunity during early vertebrate history.

Journal ArticleDOI
TL;DR: Clarifying species identifications permits a more accurate assessment of introduction histories and distributions, and provides a very different picture of the tempo and pattern of invasions than was inferred when the three species with channeled sutures were considered one.
Abstract: Since the mid 1990s populations of non-native apple snails (Ampullariidae) have been discovered with increasing frequency in the continental United States. Given the dramatic effects that introduced apple snails have had on both natural habitats and agricultural areas in Southeast Asia, their introduction to the mainland US is cause for concern. We combine phylogenetic analyses of mtDNA sequences with examination of introduced populations and museum collections to clarify the identities, introduced distributions, geographical origins, and introduction histories of apple snails. Based on sampling to date, we conclude there are five species of non-native apple snails in the continental US. Most significantly, we recognize three species within what has been called the channeled apple snail: Pomacea canaliculata (California and Arizona), Pomacea insularum (Florida, Texas, and Georgia) and Pomacea haustrum (Florida). The first established populations of P. haustrum were discovered in the late 1970s in Palm Beach County, Florida, and have not spread appreciably in 30 years. In contrast, populations of P. insularum were established in Texas by 1989, in Florida by the mid to late 1990s, and in Georgia by 2005, and this species continues to spread rapidly. Most introduced P. insularum haplotypes are a close match to haplotypes from the Rio Uruguay near Buenos Aires, indicating cold tolerance, with the potential to spread from Florida, Georgia, and Texas through Louisiana, Alabama, Mississippi, and South Carolina. Pomacea canaliculata populations were first discovered in California in 1997. Haplotypes of introduced P. canaliculata match native-range haplotypes from near Buenos Aires, Argentina, also indicating cold tolerance and the potential to establish farther north. The term "channeled apple snail" is descriptive of a morphology found in many apple snail species. It does not identify a single species or a monophyletic group. Clarifying species identifications permits a more accurate assessment of introduction histories and distributions, and provides a very different picture of the tempo and pattern of invasions than was inferred when the three species with channeled sutures were considered one. Matching introduced and native-range haplotypes suggests the potential for range expansion, with implications for native aquatic ecosystems and species, agriculture, and human health.

Journal ArticleDOI
TL;DR: Overall, the sex specific genetic signature of different postmarital habits of residence in the Hill Tribes is robust, and specific perturbations related to linguistic differences, population specific traits, and the complex migratory history of these groups can be identified.
Abstract: Background: Ethnic minorities in Northern Thailand, often referred to as Hill Tribes, are considered an ideal model to study the different genetic impact of sex-specific migration rates expected in matrilocal (women remain in their natal villages after the marriage and men move to their wife's village) and patrilocal societies (the opposite is true). Previous studies identified such differences, but little is known about the possible interaction with another cultural factor that may potentially affect genetic diversity, i.e. linguistic differences. In addition, Hill Tribes started to migrate to Thailand in the last centuries from different Northern areas, but the history of these migrations, the level of genetic legacy with their places of origin, and the possible confounding effects related to this migration history in the patterns of genetic diversity, have not been analysed yet. Using both original and published data on the Hill Tribes and several other Asian populations, we focused on all these aspects.

Journal ArticleDOI
TL;DR: The extremes of synonymous substitution rates measured here constitute by far the largest known range of rate variation for any group of organisms and highlight the utility of examining absolute substitution rates in a phylogenetic context rather than by traditional pairwise methods.
Abstract: It has long been known that rates of synonymous substitutions are unusually low in mitochondrial genes of flowering and other land plants. Although two dramatic exceptions to this pattern have recently been reported, it is unclear how often major increases in substitution rates occur during plant mitochondrial evolution and what the overall magnitude of substitution rate variation is across plants. A broad survey was undertaken to evaluate synonymous substitution rates in mitochondrial genes of angiosperms and gymnosperms. Although most taxa conform to the generality that plant mitochondrial sequences evolve slowly, additional cases of highly accelerated rates were found. We explore in detail one of these new cases, within the genus Silene. A roughly 100-fold increase in synonymous substitution rate is estimated to have taken place within the last 5 million years and involves only one of ten species of Silene sampled in this study. Examples of unusually slow sequence evolution were also identified. Comparison of the fastest and slowest lineages shows that synonymous substitution rates vary by four orders of magnitude across seed plants. In other words, some plant mitochondrial lineages accumulate more synonymous change in 10,000 years than do others in 100 million years. Several perplexing cases of gene-to-gene variation in sequence divergence within a plant were uncovered. Some of these probably reflect interesting biological phenomena, such as horizontal gene transfer, mitochondrial-to-nucleus transfer, and intragenomic variation in mitochondrial substitution rates, whereas others are likely the result of various kinds of errors. The extremes of synonymous substitution rates measured here constitute by far the largest known range of rate variation for any group of organisms. These results highlight the utility of examining absolute substitution rates in a phylogenetic context rather than by traditional pairwise methods. 
Why substitution rates are generally so low in plant mitochondrial genomes yet occasionally increase dramatically remains mysterious.
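
The "four orders of magnitude" figure follows directly from the two time spans quoted in the abstract. The short check below is only an illustrative sketch of that arithmetic: the variable names are made up here, and it assumes the two lineages accumulate roughly equal amounts of synonymous change over their respective spans.

```python
import math

# Time spans quoted in the abstract (years).
slow_lineage_years = 100_000_000   # slowest lineages: 100 million years
fast_lineage_years = 10_000        # fastest lineages: 10,000 years

# If both spans yield a similar amount of synonymous change,
# the per-year rate ratio is simply the ratio of the spans.
rate_ratio = slow_lineage_years / fast_lineage_years
orders_of_magnitude = math.log10(rate_ratio)

print(f"rate ratio ~ {rate_ratio:.0f}x, i.e. about {orders_of_magnitude:.0f} orders of magnitude")
```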

Journal ArticleDOI
TL;DR: The results of a phylogenetic analysis of 102 species of seed plants indicate that the parasitic lifestyle has arisen repeatedly in angiosperm evolutionary history and results in increasing parasite genomic chimerism over time.
Abstract: Some of the most difficult phylogenetic questions in evolutionary biology involve identification of the free-living relatives of parasitic organisms, particularly those of parasitic flowering plants. Consequently, the number of origins of parasitism and the phylogenetic distribution of the heterotrophic lifestyle among angiosperm lineages are unclear.

Journal ArticleDOI
TL;DR: The results suggest that simultaneous migration of hosts and parasites can dramatically affect the interaction of host and parasite, and the organism with the lower evolutionary potential may gain the greater evolutionary advantage from migration.
Abstract: The dynamics of antagonistic host-parasite coevolution are believed to be crucially dependent on the rate of migration between populations. We addressed how the rate of simultaneous migration of host and parasite affected resistance and infectivity evolution of coevolving meta-populations of the bacterium Pseudomonas fluorescens and a viral parasite (bacteriophage). The increase in genetic variation resulting from small amounts of migration is expected to increase rates of adaptation of both host and parasite. However, previous studies suggest phages should benefit more from migration than bacteria; because in the absence of migration, phages are more genetically limited and have a lower evolutionary potential compared to the bacteria. The results supported the hypothesis: migration increased the resistance of bacteria to their local (sympatric) hosts. Moreover, migration benefited phages more than hosts with respect to 'global' (measured with respect to the whole range of migration regimes) patterns of resistance and infectivity, because of the differential evolutionary responses of bacteria and phage to different migration regimes. Specifically, we found bacterial global resistance peaked at intermediate rates of migration, whereas phage global infectivity plateaued when migration rates were greater than zero. These results suggest that simultaneous migration of hosts and parasites can dramatically affect the interaction of host and parasite. More specifically, the organism with the lower evolutionary potential may gain the greater evolutionary advantage from migration.

Journal ArticleDOI
Vincent J. Lynch
TL;DR: Data show that increases in genomic complexity can lead to phenotypic complexity (venom composition) and that positive Darwinian selection is a common evolutionary force in snake venoms and regions identified on the surface of phospholipase A2 enzymes are potential candidate sites for structure based antivenin design.
Abstract: Gene duplication followed by functional divergence has long been hypothesized to be the main source of molecular novelty. Convincing examples of neofunctionalization, however, remain rare. Snake venom phospholipase A2 genes are members of large multigene families with many diverse functions, thus they are excellent models to study the emergence of novel functions after gene duplications. Here, I show that positive Darwinian selection and neofunctionalization are common in snake venom phospholipase A2 genes. The pattern of gene duplication and positive selection indicates that adaptive molecular evolution occurs immediately after duplication events as novel functions emerge and continues as gene families diversify and are refined. Surprisingly, adaptive evolution of group-I phospholipases in elapids is also associated with speciation events, suggesting adaptation of the phospholipase arsenal to novel prey species after niche shifts. Mapping the location of sites under positive selection onto the crystal structure of phospholipase A2 showed that regions evolving under diversifying selection are located on the molecular surface and are likely protein-protein interaction sites essential for toxin functions. These data show that increases in genomic complexity (through gene duplications) can lead to phenotypic complexity (venom composition) and that positive Darwinian selection is a common evolutionary force in snake venoms. Finally, regions identified under selection on the surface of phospholipase A2 enzymes are potential candidate sites for structure-based antivenin design.

Journal ArticleDOI
TL;DR: This study examines the relation between environmental variability and modularity in a natural and well-studied system, the metabolic networks of bacteria, and finds that metabolic networks of organisms in variable environments are significantly more modular than networks of organisms that evolved under more constant conditions.
Abstract: Biological systems are often modular: they can be decomposed into nearly-independent structural units that perform specific functions. The evolutionary origin of modularity is a subject of much current interest. Recent theory suggests that modularity can be enhanced when the environment changes over time. However, this theory has not yet been tested using biological data. To address this, we studied the relation between environmental variability and modularity in a natural and well-studied system, the metabolic networks of bacteria. We classified 117 bacterial species according to the degree of variability in their natural habitat. We find that metabolic networks of organisms in variable environments are significantly more modular than networks of organisms that evolved under more constant conditions. This study supports the view that variability in the natural habitat of an organism promotes modularity in its metabolic network and perhaps in other biological systems.

Journal ArticleDOI
TL;DR: The results suggest that natural selection has acted on codon usage in the genus Drosophila, at least often enough to leave a footprint of selection in modern genomes.
Abstract: Codon usage bias (CUB), the uneven use of synonymous codons, is a ubiquitous observation in virtually all organisms examined. The pattern of codon usage is generally similar among closely related species, but differs significantly among distantly related organisms, e.g., bacteria, yeast, and Drosophila. Several explanations for CUB have been offered and some have been supported by observations and experiments, although a thorough understanding of the evolutionary forces (random drift, mutation bias, and selection) and their relative importance remains to be determined. The recently available complete genome DNA sequences of twelve phylogenetically defined species of Drosophila offer a hitherto unprecedented opportunity to examine these problems. We report here the patterns of codon usage in the twelve species and offer insights on possible evolutionary forces involved. (1) Codon usage is quite stable across 11/12 of the species: G- and especially C-ending codons are used most frequently, thus defining the preferred codons. (2) The only amino acid that changes in preferred codon is Serine with six species of the melanogaster group favoring TCC while the other species, particularly subgenus Drosophila species, favor AGC. (3) D. willistoni is an exception to these generalizations in having a shifted codon usage for seven amino acids toward A/T in the wobble position. (4) Amino acids differ in their contribution to overall CUB, Leu having the greatest and Asp the least. (5) Among two-fold degenerate amino acids, A/G ending amino acids have more selection on codon usage than T/C ending amino acids. (6) Among the different chromosome arms or elements, genes on the non-recombining element F (dot chromosome) have the least CUB, while genes on the element A (X chromosome) have the most. (7) Introns indicate that mutation bias in all species is approximately 2:1, AT:GC, the opposite of codon usage bias. 
(8) There is also evidence for some overall regional bias in base composition that may influence codon usage. Overall, these results suggest that natural selection has acted on codon usage in the genus Drosophila, at least often enough to leave a footprint of selection in modern genomes. However, there is evidence in the data that random forces (drift and mutation) have also left patterns in the data, especially in genes under weak selection for codon usage, for example genes in regions of low recombination. The documentation of codon usage patterns in each of these twelve genomes also aids in ongoing annotation efforts.
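
To make "uneven use of synonymous codons" concrete, the sketch below computes relative synonymous codon usage (RSCU), a commonly used descriptive measure: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. This is an illustration only. The abstract does not say which statistic the authors used, the function name `rscu` and the toy sequence are hypothetical, and only two amino acids (Asp and Ser) are tabulated for brevity.

```python
from collections import Counter

# Synonymous codon families from the standard genetic code.
# Only two amino acids are listed to keep the example short.
SYNONYMOUS = {
    "Asp": ["GAT", "GAC"],
    "Ser": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}

def rscu(cds, families=SYNONYMOUS):
    """Return {codon: RSCU} for the codon families given.

    RSCU = observed count / (family total / family size).
    A value of 1.0 means no bias; values above 1.0 mark preferred codons.
    """
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    result = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue                           # amino acid absent from the sequence
        expected = total / len(family)
        for codon in family:
            result[codon] = counts[codon] / expected
    return result

# Toy coding sequence (hypothetical): GAC GAC GAC GAT TCC TCC AGC
values = rscu("GACGACGACGATTCCTCCAGC")
for codon in ("GAC", "GAT", "TCC", "AGC", "TCT"):
    print(codon, values[codon])
```

In this toy sequence the C-ending codons GAC and TCC come out with RSCU above 1, the kind of pattern the abstract describes as defining preferred codons.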

Journal ArticleDOI
TL;DR: It is suggested that cyanobacterial associations with protists, like the Rhopalodia gibba-spheroid body symbiosis, could serve as important model systems for the investigation of the complex mechanisms underlying organelle evolution.
Abstract: Nitrogen, a component of many bio-molecules, is essential for growth and development of all organisms. Most nitrogen exists in the atmosphere, and utilisation of this source is important as a means of avoiding nitrogen starvation. However, the ability to fix atmospheric nitrogen via the nitrogenase enzyme complex is restricted to some bacteria. Eukaryotic organisms are only able to obtain fixed nitrogen through their symbiotic interactions with nitrogen-fixing prokaryotes. These symbioses involve a variety of host organisms, including animals, plants, fungi and protists. We have compared the morphological, physiological and molecular characteristics of nitrogen fixing symbiotic associations of bacteria and their diverse hosts. Special features of the interaction, e.g. vertical transmission of symbionts, grade of dependency of partners and physiological modifications have been considered in terms of extent of co-evolution and adaptation. Our findings are that, despite many adaptations enabling a beneficial partnership, most symbioses for molecular nitrogen fixation involve facultative interactions. However, some interactions, among them endosymbioses between cyanobacteria and diatoms, show characteristics that reveal a more obligate status of co-evolution. Our review emphasises that molecular nitrogen fixation, a driving force for interactions and co-evolution of different species, is a widespread phenomenon involving many different organisms and ecosystems. The diverse grades of symbioses, ranging from loose associations to highly specific intracellular interactions, might themselves reflect the range of potential evolutionary fates for symbiotic partnerships. These include the extreme evolutionary modifications and adaptations that have accompanied the formation of organelles in eukaryotic cells: plastids and mitochondria. 
However, age and extensive adaptation of plastids and mitochondria complicate the investigation of processes involved in the transition of symbionts to organelles. Extant lineages of symbiotic associations for nitrogen fixation show diverse grades of adaptation and co-evolution, thereby representing different stages of symbiont-host interaction. In particular, cyanobacterial associations with protists, like the Rhopalodia gibba-spheroid body symbiosis, could serve as important model systems for the investigation of the complex mechanisms underlying organelle evolution.

Journal ArticleDOI
TL;DR: These data further support a highly complex LCEA and indicate that the basic architecture of the trafficking system is remarkably conserved and ancient, with the SM proteins and tethering factors having originated very early in eukaryotic evolution.
Abstract: In membrane trafficking, the mechanisms ensuring vesicle fusion specificity remain to be fully elucidated. Early models proposed that specificity was encoded entirely by SNARE proteins; more recent models include contributions from Rab proteins, Syntaxin-binding (SM) proteins and tethering factors. Most information on membrane trafficking derives from an evolutionarily narrow sampling of model organisms. However, considering factors from a wider diversity of eukaryotes can provide both functional information on core systems and insight into the evolutionary history of the trafficking machinery. For example, the major Qa/syntaxin SNARE families are present in most eukaryotic genomes and likely each evolved via gene duplication from a single ancestral syntaxin before the existing eukaryotic groups diversified. This pattern is also likely for Rabs and various other components of the membrane trafficking machinery. We performed comparative genomic and phylogenetic analyses, when relevant, on the SM proteins and components of the tethering complexes, both thought to contribute to vesicle fusion specificity. Despite evidence suggestive of secondary losses amongst many lineages, the tethering complexes are well represented across the eukaryotes, suggesting an origin predating the radiation of eukaryotic lineages. Further, whilst we detect distant sequence relations between GARP, COG, exocyst and DSL1 components, these similarities most likely reflect convergent evolution of similar secondary structural elements. No similarity is found between the TRAPP and HOPS complexes and the other tethering factors. Overall, our data favour independent origins for the various tethering complexes. The taxa examined possess at least one homologue of each of the four SM protein families; since the four monophyletic families each encompass a wide diversity of eukaryotes, the SM protein families very likely evolved before the last common eukaryotic ancestor (LCEA). 
These data further support a highly complex LCEA and indicate that the basic architecture of the trafficking system is remarkably conserved and ancient, with the SM proteins and tethering factors having originated very early in eukaryotic evolution. However, the independent origin of the tethering complexes suggests a novel pattern for increasing complexity in the membrane trafficking system, in addition to the pattern of paralogous machinery elaboration seen thus far.

Journal ArticleDOI
TL;DR: The results support an important role of the FSGD and other types of duplication in the evolution of pigmentation in fish, and suggest teleost fishes apparently have a greater repertoire of pigment synthesis genes than any other vertebrate group.
Abstract: Coloration and color patterning belong to the most diverse phenotypic traits in animals. Particularly, teleost fishes possess more pigment cell types than any other group of vertebrates. As the result of an ancient fish-specific genome duplication (FSGD), teleost genomes might contain more copies of genes involved in pigment cell development than tetrapods. No systematic genomic inventory that would allow this hypothesis to be tested has been drawn up so far for pigmentation genes in fish, and almost nothing is known about the evolution of these genes in different fish lineages. Using a comparative genomic approach including phylogenetic reconstructions and synteny analyses, we have studied two major pigment synthesis pathways in teleost fish, the melanin and the pteridine pathways, with respect to different types of gene duplication. Genes encoding three of the four enzymes involved in the synthesis of melanin from tyrosine have been retained as duplicates after the FSGD. In the pteridine pathway, two cases of duplicated genes originating from the FSGD as well as several lineage-specific gene duplications were observed. In both pathways, genes encoding the rate-limiting enzymes, tyrosinase and GTP-cyclohydrolase I (GchI), have additional paralogs in teleosts compared to tetrapods, which have been generated by different modes of duplication. We have also observed a previously unrecognized diversity of gchI genes in vertebrates. In addition, we have found evidence for divergent resolution of duplicated pigmentation genes, i.e., differential gene loss in divergent teleost lineages, particularly in the tyrosinase gene family. Mainly due to the FSGD, teleost fishes apparently have a greater repertoire of pigment synthesis genes than any other vertebrate group. Our results support an important role of the FSGD and other types of duplication in the evolution of pigmentation in fish.

Journal ArticleDOI
TL;DR: Based on a large collection of EST sequences, evidence is provided that the haploid moss Physcomitrella patens is a paleopolyploid as well and metabolic genes seem to have been retained in excess following the genome duplication in P. patens.
Abstract: Analyses of complete genomes and large collections of gene transcripts have shown that most, if not all, seed plants have undergone one or more genome duplications in their evolutionary past. In this study, based on a large collection of EST sequences, we provide evidence that the haploid moss Physcomitrella patens is a paleopolyploid as well. Based on the construction of linearized phylogenetic trees, we infer the genome duplication to have occurred between 30 and 60 million years ago. Gene Ontology and pathway association of the duplicated genes in P. patens reveal different biases of gene retention compared with seed plants. Metabolic genes seem to have been retained in excess following the genome duplication in P. patens. This might, at least partly, explain the versatility of metabolism, as described for P. patens and other mosses, in comparison to other land plants.

Journal ArticleDOI
TL;DR: SCaFoS is the first tool that integrates user's knowledge to select orthologous sequences, creates chimerical sequences to reduce missing data and selects genes according to their level of missing data, showing that the judicious selection of genes, species and sequences reduces tree reconstruction artefacts, especially if the dataset includes fast evolving species.
Abstract: Phylogenetic analyses based on datasets rich in both genes and species (phylogenomics) are becoming a standard approach to resolve evolutionary questions. However, several difficulties are associated with the assembly of large datasets, such as multiple copies of a gene per species (paralogous or xenologous genes), lack of some genes for a given species, or partial sequences. The use of undetected paralogous or xenologous genes in phylogenetic inference can lead to inaccurate results, and the use of partial sequences to a lack of resolution. A tool that selects sequences, species, and genes, while dealing with these issues, is needed in a phylogenomics context. Here, we present SCaFoS, a tool that quickly assembles phylogenomic datasets containing maximal phylogenetic information while adjusting the amount of missing data in the selection of species, sequences and genes. Starting from individual sequence alignments, and using monophyletic groups defined by the user, SCaFoS creates chimeras with partial sequences, or selects, among multiple sequences, the orthologous and/or slowest evolving sequences. Once sequences representing each predefined monophyletic group have been selected, SCaFoS retains genes according to the user's allowed level of missing data and generates files for super-matrix and super-tree analyses in several formats compatible with standard phylogenetic inference software. Because no clear-cut criteria exist for the sequence selection, a semi-automatic mode is available to accommodate user's expertise. SCaFoS is able to deal with datasets of hundreds of species and genes, at both the amino acid and nucleotide levels. It has a graphical interface and can be integrated in an automatic workflow. Moreover, SCaFoS is the first tool that integrates user's knowledge to select orthologous sequences, creates chimerical sequences to reduce missing data and selects genes according to their level of missing data.
Finally, applying SCaFoS to different datasets, we show that the judicious selection of genes, species and sequences reduces tree reconstruction artefacts, especially if the dataset includes fast evolving species.
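
The gene-retention step described above (keep a gene only if its level of missing data is within a user-chosen limit) can be sketched in a few lines. This is a minimal illustration of the idea, not SCaFoS code or its interface: the function name `select_genes`, the `presence` data layout, the gene and group names, and the threshold value are all assumptions made for this example.

```python
def select_genes(presence, max_missing_fraction):
    """Keep genes whose fraction of missing groups is within the limit.

    presence: dict mapping gene name -> dict mapping group name -> bool,
              where True means a sequence is available for that group.
    max_missing_fraction: allowed share of groups lacking a sequence (0.0 to 1.0).
    """
    kept = []
    for gene, by_group in presence.items():
        missing = sum(1 for present in by_group.values() if not present)
        if missing / len(by_group) <= max_missing_fraction:
            kept.append(gene)
    return kept

# Hypothetical toy dataset: three genes scored across four predefined groups.
presence = {
    "geneA": {"g1": True,  "g2": True,  "g3": True,  "g4": True},   # 0% missing
    "geneB": {"g1": True,  "g2": False, "g3": True,  "g4": True},   # 25% missing
    "geneC": {"g1": False, "g2": False, "g3": True,  "g4": False},  # 75% missing
}

selected = select_genes(presence, max_missing_fraction=0.25)
print(selected)
```

Lowering the threshold trades dataset size for completeness; with a limit of 0.0 only fully sampled genes would remain.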

Journal ArticleDOI
TL;DR: The data indicate that natural hybridization has occurred at relatively low rates, and a partial congruence between phenotypically and genetically intermediate individuals was found, suggesting that intermediate appearance does not necessarily mean hybridization.
Abstract: Analysis of interspecific gene flow is crucial for the understanding of speciation processes and maintenance of species integrity. Oaks (genus Quercus, Fagaceae) are among the model species for the study of hybridization. Natural co-occurrence of four closely related oak species is a very rare case in the temperate forests of Europe. We used both morphological characters and genetic markers to characterize hybridization in a natural community situated in west-central Romania and consisting of Quercus robur, Q. petraea, Q. pubescens, and Q. frainetto. On the basis of pubescence and leaf morphological characters, ~94% of the sampled individuals were assigned to pure species. Only 16 (~6%) individual trees exhibited intermediate morphologies or a combination of characters of different species. Four chloroplast DNA haplotypes were identified in the study area. The distribution of haplotypes within the white oak complex showed substantial differences among species. However, the most common haplotypes were present in all four species. Furthermore, based on a set of 7 isozyme and 6 microsatellite markers and using a Bayesian admixture analysis without any a priori information on morphology, we found that four genetic clusters best fit the data. There was a very good correspondence of each species with one of the inferred genetic clusters. The estimated introgression level varied markedly between pairs of species, ranging from 1.7% between Q. robur and Q. frainetto to 16.2% between Q. pubescens and Q. frainetto. Only nine individuals (3.4%) appeared to be first-generation hybrids. Our data indicate that natural hybridization has occurred at relatively low rates. The different levels of gene flow among species might be explained by differences in flowering time and spatial position within the stand.
In addition, a partial congruence between phenotypically and genetically intermediate individuals was found, suggesting that intermediate appearance does not necessarily mean hybridization. However, it appears that natural hybridization did not seriously affect the species identity in this area of sympatry.

Journal ArticleDOI
TL;DR: The finding of the reciprocal paraphyly of Hexapoda and Crustacea suggests an evolutionary scenario in which the acquisition of the hexapod condition may have occurred several times independently in lineages descending from different crustacean-like ancestors, possibly as a consequence of the process of terrestrialization.
Abstract: The phylogeny of Arthropoda is still a matter of harsh debate among systematists, and significant disagreement exists between morphological and molecular studies. In particular, while the taxon joining hexapods and crustaceans (the Pancrustacea) is now widely accepted among zoologists, the relationships among its basal lineages, and particularly the supposed reciprocal paraphyly of Crustacea and Hexapoda, continues to represent a challenge. Several genes, as well as different molecular markers, have been used to tackle this problem in molecular phylogenetic studies, with the mitochondrial DNA being one of the molecules of choice. In this study, we have assembled the largest data set available so far for Pancrustacea, consisting of 100 complete (or almost complete) sequences of mitochondrial genomes. After removal of unalignable sequence regions and highly rearranged genomes, we used nucleotide and inferred amino acid sequences of the 13 protein coding genes to reconstruct the phylogenetic relationships among major lineages of Pancrustacea. The analysis was performed with Bayesian inference, and for the amino acid sequences a new, Pancrustacea-specific, matrix of amino acid replacement was developed and used in this study. Two largely congruent trees were obtained from the analysis of nucleotide and amino acid datasets. In particular, the best tree obtained based on the new matrix of amino acid replacement (MtPan) was preferred over those obtained using previously available matrices (MtArt and MtRev) because of its higher likelihood score. The most remarkable result is the reciprocal paraphyly of Hexapoda and Crustacea, with some lineages of crustaceans (namely the Malacostraca, Cephalocarida and, possibly, the Branchiopoda) being more closely related to the Insecta s.s. (Ectognatha) than two orders of basal hexapods, Collembola and Diplura. 
Our results confirm that the mitochondrial genome, unlike analyses based on morphological data or nuclear genes, consistently supports the non-monophyly of Hexapoda. The finding of the reciprocal paraphyly of Hexapoda and Crustacea suggests an evolutionary scenario in which the acquisition of the hexapod condition may have occurred several times independently in lineages descending from different crustacean-like ancestors, possibly as a consequence of the process of terrestrialization. If this hypothesis were confirmed, we should therefore re-think our interpretation of the evolution of the Arthropoda, where terrestrialization may have led to the acquisition of similar anatomical features by convergence. At the same time, the disagreement between reconstructions based on morphological, nuclear and mitochondrial data sets seems to remain, despite the use of larger data sets and more powerful analytical methods.

Journal ArticleDOI
TL;DR: From the direct observation of hybrids, it is concluded that hybridization between distantly related gastropod-shell-breeding cichlids of Lake Tanganyika follows inevitably from their ecological specialization.
Abstract: The tribe Lamprologini is the major substrate breeding lineage of Lake Tanganyika's cichlid species flock. Among several different life history strategies found in lamprologines, the adaptation to live and breed in empty gastropod shells is probably the most peculiar. Although shell-breeding arose several times in the evolutionary history of the lamprologines, all obligatory and most facultative shell-breeders belong to the so called "ossified group", a monophyletic lineage within the lamprologine cichlids. Since their distinctive life style enables these species to live and breed in closest vicinity, we hypothesized that these cichlids might be particularly prone to accidental hybridization, and that introgression might have affected the evolutionary history of this cichlid lineage. Our analyses revealed discrepancies between phylogenetic hypotheses based on mitochondrial and nuclear (AFLP) data. While the nuclear phylogeny was congruent with morphological, behavioral and ecological characteristics, several species – usually highly specialized shell-breeders – were placed at contradicting positions in the mitochondrial phylogeny. The discordant phylogenies strongly suggest repeated incidents of introgressive hybridization between several distantly related shell-breeding species, which reticulated the phylogeny of this group of cichlids. Long interior branches and high bootstrap support for many interior nodes in the mitochondrial phylogeny argue against a major effect of ancient incomplete lineage sorting on the phylogenetic reconstruction. Moreover, we provide morphological and genetic (mtDNA and microsatellites) evidence for ongoing hybridization among distantly related shell-breeders. In these cases, the territorial males of the inferred paternal species are too large to enter the shells of their mate, such that they have to release their sperm over the entrance of the shell to fertilize the eggs. 
With sperm dispersal by water currents and wave action, trans-specific fertilization of clutches in neighboring shells seems inevitable when post-zygotic isolation is incomplete. From the direct observation of hybrids we conclude that hybridization between distantly related gastropod-shell-breeding cichlids of Lake Tanganyika follows inevitably from their ecological specialization. Moreover, the observed incongruence between mtDNA and nuclear multilocus phylogeny suggests that repeated hybridization events among quite distantly related taxa affected the diversification of this group, and introduced reticulation into their phylogeny.

Journal ArticleDOI
TL;DR: A dated molecular supertree for all 34 world pinniped species derived from a weighted matrix representation with parsimony (MRP) supertree analysis of 50 gene trees, each determined under a maximum likelihood (ML) framework is presented.
Abstract: Phylogenetic comparative methods are often improved by complete phylogenies with meaningful branch lengths (e.g., divergence dates). This study presents a dated molecular supertree for all 34 world pinniped species derived from a weighted matrix representation with parsimony (MRP) supertree analysis of 50 gene trees, each determined under a maximum likelihood (ML) framework. Divergence times were determined by mapping the same sequence data (plus two additional genes) on to the supertree topology and calibrating the ML branch lengths against a range of fossil calibrations. We assessed the sensitivity of our supertree topology in two ways: 1) a second supertree with all mtDNA genes combined into a single source tree, and 2) likelihood-based supermatrix analyses. Divergence dates were also calculated using a Bayesian relaxed molecular clock with rate autocorrelation to test the sensitivity of our supertree results further. The resulting phylogenies all agreed broadly with recent molecular studies, in particular supporting the monophyly of Phocidae, Otariidae, and the two phocid subfamilies, as well as an Odobenidae + Otariidae sister relationship; areas of disagreement were limited to four more poorly supported regions. Neither the supertree nor supermatrix analyses supported the monophyly of the two traditional otariid subfamilies, supporting suggestions for the need for taxonomic revision in this group. Phocid relationships were similar to other recent studies and deeper branches were generally well-resolved. Halichoerus grypus was nested within a paraphyletic Pusa, although relationships within Phocina tend to be poorly supported. Divergence date estimates for the supertree were in good agreement with other studies and the available fossil record; however, the Bayesian relaxed molecular clock divergence date estimates were significantly older. 
Our results join other recent studies and highlight the need for a re-evaluation of pinniped taxonomy, especially as regards the subfamilial classification of otariids and the generic nomenclature of Phocina. Even with the recent publication of new sequence data, the available genetic sequence information for several species, particularly those in Arctocephalus, remains very limited, especially for nuclear markers. However, resolution of parts of the tree will probably remain difficult, even with additional data, due to apparent rapid radiations. Our study addresses the lack of a recent pinniped phylogeny that includes all species and robust divergence dates for all nodes, and will therefore prove indispensable to comparative and macroevolutionary studies of this group of carnivores.

Journal ArticleDOI
TL;DR: The results demonstrate that despite its conservative nature, Rubisco evolves under positive selection in most lineages of land plants, and after billions of years of evolution Darwinian selection still fine-tunes its performance.
Abstract: Rubisco enzyme catalyzes the first step in net photosynthetic CO2 assimilation and photorespiratory carbon oxidation and is responsible for almost all carbon fixation on Earth. The large subunit of Rubisco is encoded by the chloroplast rbcL gene, which is widely used for reconstruction of plant phylogenies due to its conservative nature. Plant systematists have mainly used rbcL without regard to its function, and the question of whether it evolves under Darwinian selection has received little attention. The purpose of our study was to evaluate how common positive selection is in Rubisco among the phototrophs and where in the Rubisco structure it occurs. We searched for positive selection in rbcL sequences from over 3000 species representing all lineages of green plants and some lineages of other phototrophs, such as brown and red algae, diatoms, euglenids and cyanobacteria. Our molecular phylogenetic analysis found the presence of positive selection in rbcL of most analyzed land plants, but not in algae and cyanobacteria. The mapping of the positively selected residues on the Rubisco tertiary structure revealed that they are located in regions important for dimer-dimer, intradimer, large subunit-small subunit and Rubisco-Rubisco activase interactions, and that some of the positively selected residues are close to the active site. Our results demonstrate that despite its conservative nature, Rubisco evolves under positive selection in most lineages of land plants, and after billions of years of evolution Darwinian selection still fine-tunes its performance. Widespread positive selection in rbcL has to be taken into account when this gene is used for phylogenetic reconstructions.