Showing papers on "Phylogenetic tree published in 2003"


Journal ArticleDOI
TL;DR: Analysis of variance of log K for all 121 traits indicated that behavioral traits exhibit lower signal than body size, morphological, life-history, or physiological traits, and this work presents new methods for continuous-valued characters that can be implemented with either phylogenetically independent contrasts or generalized least-squares models.
Abstract: The primary rationale for the use of phylogenetically based statistical methods is that phylogenetic signal, the tendency for related species to resemble each other, is ubiquitous. Whether this assertion is true for a given trait in a given lineage is an empirical question, but general tools for detecting and quantifying phylogenetic signal are inadequately developed. We present new methods for continuous-valued characters that can be implemented with either phylogenetically independent contrasts or generalized least-squares models. First, a simple randomization procedure allows one to test the null hypothesis of no pattern of similarity among relatives. The test demonstrates correct Type I error rate at a nominal α = 0.05 and good power (0.8) for simulated datasets with 20 or more species. Second, we derive a descriptive statistic, K, which allows valid comparisons of the amount of phylogenetic signal across traits and trees. Third, we provide two biologically motivated branch-length transformat...
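The randomization procedure described in this abstract can be sketched in a few lines. The Python below is an illustration under stated assumptions, not the authors' implementation: it computes standardized independent contrasts with Felsenstein's pruning recursion on a fully bifurcating tree, takes the mean squared contrast as the test statistic (assumed here), and reports the fraction of tip-value permutations whose statistic is at least as small as the observed one. The tuple-based tree encoding and the names `contrasts`, `contrast_variance`, and `signal_p_value` are hypothetical choices made for this example.

```python
import random

# Hypothetical tree encoding for this sketch:
#   leaf:          ("leaf", name, branch_length)
#   internal node: ("node", left_child, right_child, branch_length)

def contrasts(node, values, out):
    """Return (trait estimate, adjusted branch length) for node.

    Standardized contrasts are appended to `out` as a side effect.
    """
    if node[0] == "leaf":
        return values[node[1]], node[2]
    x1, v1 = contrasts(node[1], values, out)
    x2, v2 = contrasts(node[2], values, out)
    # Contrast between the two daughters, scaled by its expected std. deviation.
    out.append((x1 - x2) / (v1 + v2) ** 0.5)
    # Branch-length-weighted average becomes the value at this node.
    x = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the branch below this node to reflect estimation error.
    return x, node[3] + v1 * v2 / (v1 + v2)

def contrast_variance(tree, values):
    """Mean squared standardized contrast for one assignment of tip values."""
    out = []
    contrasts(tree, values, out)
    return sum(c * c for c in out) / len(out)

def signal_p_value(tree, values, n_perm=999, seed=0):
    """Share of permutations with contrast variance <= the observed value."""
    rng = random.Random(seed)
    names = list(values)
    observed = contrast_variance(tree, values)
    hits = 0
    for _ in range(n_perm):
        shuffled = [values[k] for k in names]
        rng.shuffle(shuffled)  # break any link between trait values and tips
        if contrast_variance(tree, dict(zip(names, shuffled))) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

if __name__ == "__main__":
    # Made-up four-species example, for illustration only.
    tree = ("node",
            ("node", ("leaf", "A", 1.0), ("leaf", "B", 1.0), 1.0),
            ("node", ("leaf", "C", 1.0), ("leaf", "D", 1.0), 1.0),
            0.0)
    traits = {"A": 1.0, "B": 1.1, "C": 5.0, "D": 5.2}
    print(signal_p_value(tree, traits))
```

A small p-value means that the real data have a lower contrast variance than most shuffled data sets, which is the pattern expected when relatives resemble each other. With only four tips the test has very little power, consistent with the abstract's remark that good power requires 20 or more species.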

3,896 citations


Journal ArticleDOI
TL;DR: Despite the near-universal usage of ITS sequence data in plant phylogenetic studies, its complex and unpredictable evolutionary behavior reduces its utility for phylogenetic analysis, and it is suggested that more robust insights are likely to emerge from the use of single-copy or low-copy nuclear genes.

1,718 citations


Journal ArticleDOI
23 Oct 2003-Nature
TL;DR: The results suggest that data sets consisting of single or a small number of concatenated genes have a significant probability of supporting conflicting topologies, and have important implications for resolving branches of the tree of life.
Abstract: One of the most pervasive challenges in molecular phylogenetics is the incongruence between phylogenies obtained using different data sets, such as individual genes. To systematically investigate the degree of incongruence, and potential methods for resolving it, we screened the genome sequences of eight yeast species and selected 106 widely distributed orthologous genes for phylogenetic analyses, singly and by concatenation. Our results suggest that data sets consisting of single or a small number of concatenated genes have a significant probability of supporting conflicting topologies. By contrast, analyses of the entire data set of concatenated genes yielded a single, fully resolved species tree with maximum support. Comparable results were obtained with a concatenation of a minimum of 20 genes; substantially more genes than commonly used but a small fraction of any genome. These results have important implications for resolving branches of the tree of life.

1,490 citations


Journal ArticleDOI
27 Mar 2003-Nature
TL;DR: It is shown that a genome-wide duplication post-dates the divergence of Arabidopsis from most dicots, and that additional, more ancient duplication events affect more distant taxonomic comparisons.
Abstract: Conservation of gene order in vertebrates is evident after hundreds of millions of years of divergence, but comparisons of the Arabidopsis thaliana sequence to partial gene orders of other angiosperms (flowering plants) sharing common ancestry approximately 170-235 million years ago yield conflicting results. This difference may be largely due to the propensity of angiosperms to undergo chromosomal duplication ('polyploidization') and subsequent gene loss ('diploidization'); these evolutionary mechanisms have profound consequences for comparative biology. Here we integrate a phylogenetic approach (relating chromosomal duplications to the tree of life) with a genomic approach (mitigating information lost to diploidization) to show that a genome-wide duplication post-dates the divergence of Arabidopsis from most dicots. We also show that an inferred ancestral gene order for Arabidopsis reveals more synteny with other dicots (exemplified by cotton), and that additional, more ancient duplication events affect more distant taxonomic comparisons. By using partial sequence data for many diverse taxa to better relate the evolutionary history of completely sequenced genomes to the tree of life, we foster comparative approaches to the study of genome organization, consequences of polyploidy, and the molecular basis of quantitative traits.

1,420 citations


Journal ArticleDOI
TL;DR: The distribution of secondary metabolite profiles means that the systematic value of chemical characters becomes a matter of interpretation in the same way as traditional morphological markers, and their occurrence apparently reflects adaptations and particular life strategies embedded in a given phylogenetic framework.

1,164 citations


Journal ArticleDOI
TL;DR: A method described by Nielsen is extended to the mapping of morphological characters under continuous-time Markov models, and its utility for mapping characters on trees and for identifying character correlation is demonstrated.
Abstract: Many questions in evolutionary biology are best addressed by comparing traits in different species. Often such studies involve mapping characters on phylogenetic trees. Mapping characters on trees allows the nature, number, and timing of the transformations to be identified. The parsimony method is the only method available for mapping morphological characters on phylogenies. Although the parsimony method often makes reasonable reconstructions of the history of a character, it has a number of limitations. These limitations include the inability to consider more than a single change along a branch on a tree and the uncoupling of evolutionary time from amount of character change. We extended a method described by Nielsen (2002, Syst. Biol. 51:729-739) to the mapping of morphological characters under continuous-time Markov models and demonstrate here the utility of the method for mapping characters on trees and for identifying character correlation. (Bayesian estimation; character correlation; character mapping; Markov chain Monte Carlo.) The footprint of natural selection on organisms can often be detected using phylogenetic methods. Correlation in either molecular or morphological characters is taken as evidence of natural selection acting on those characters (Harvey and Pagel, 1991). The correlation might be between a character and the environment, with the repeated evolution of the character in a particular environment indicating that the trait confers an advantage, or the correlation may be between one character and another. In ribosomal RNA sequences, for example, correlated changes occur in nucleotides paired in the stem structures; natural selection is acting to maintain Watson-Crick pairing of nucleotides in the functionally important stem structures. In either case (correlation between different characters or the repeated evolution of a character in a particular environment), phylogenetic methods provide the best framework for the analysis of correlation because they allow the effects of a common phylogenetic history that simultaneously acts on all of the characters to be partitioned from the evolutionary processes generating the character patterns (Felsenstein, 1985). Despite the importance of phylogenetic analysis of character change in evolutionary biology, detection of correlation in characters is fraught with difficulties. One dilemma involves how characters should be mapped onto a phylogenetic tree. Many methods for detecting correlations rely on mapping character changes on a phylogenetic tree using the parsimony method (Ridley, 1983; Maddison, 1990). The parsimony method provides the minimum number of transformations required to explain the evolution of the character on the tree and therefore necessarily underestimates the total number of changes. Furthermore, some methods treat the parsimony mapping of a character as an observation in further statistical analyses (Ridley, 1983; Maddison, 1990). Although the parsimony method is expected to provide a reasonable mapping of a character when the rates of evolution are low, the fundamental problem with the method is that it does not account for the uncertainty in the process of character change. In effect, the parsimony method wagers all on the mapping requiring the fewest changes, when in reality many other perhaps slightly less parsimonious mappings may be nearly as good or

775 citations


Journal ArticleDOI
TL;DR: Many unexpected but highly supported relationships were found within the Percomorpha, a promising basis for the next investigative step towards resolving this remarkably diversified group of teleosts.

704 citations


Journal ArticleDOI
15 Aug 2003-Science
TL;DR: A comparative analytical framework for examining phylogenetic patterns of diversification and morphological disparity with data from four iguanian-lizard taxa that exhibit substantially different patterns of evolution is presented.
Abstract: Identification of general properties of evolutionary radiations has been hindered by the lack of a general statistical and phylogenetic approach applicable across diverse taxa. We present a comparative analytical framework for examining phylogenetic patterns of diversification and morphological disparity with data from four iguanian-lizard taxa that exhibit substantially different patterns of evolution. Taxa whose diversification occurred disproportionately early in their evolutionary history partition more of their morphological disparity among, rather than within, subclades. This inverse relationship between timing of diversification and morphological disparity within subclades may be a general feature that transcends the historically contingent properties of different evolutionary radiations.

622 citations


Journal ArticleDOI
TL;DR: In this study, simulations are used to show that the reduced accuracy associated with including incomplete taxa is caused by these taxa bearing too few complete characters rather than too many missing data cells, and the results suggest a more effective strategy for dealing with incomplete taxa.
Abstract: The problem of missing data is often considered to be the most important obstacle in reconstructing the phylogeny of fossil taxa and in combining data from diverse characters and taxa for phylogenetic analysis. Empirical and theoretical studies show that including highly incomplete taxa can lead to multiple equally parsimonious trees, poorly resolved consensus trees, and decreased phylogenetic accuracy. However, the mechanisms that cause incomplete taxa to be problematic have remained unclear. It has been widely assumed that incomplete taxa are problematic because of the proportion or amount of missing data that they bear. In this study, I use simulations to show that the reduced accuracy associated with including incomplete taxa is caused by these taxa bearing too few complete characters rather than too many missing data cells. This seemingly subtle distinction has a number of important implications. First, the so-called missing data problem for incomplete taxa is, paradoxically, not directly related to their amount or proportion of missing data. Thus, the level of completeness alone should not guide the exclusion of taxa (contrary to common practice), and these results may explain why empirical studies have sometimes found little relationship between the completeness of a taxon and its impact on an analysis. These results also (1) suggest a more effective strategy for dealing with incomplete taxa, (2) call into question a justification of the controversial phylogenetic supertree approach, and (3) show the potential for the accurate phylogenetic placement of highly incomplete taxa, both when combining diverse data sets and when analyzing relationships of fossil taxa.

609 citations


Journal ArticleDOI
TL;DR: The value of applying ITS2 RNA transcript secondary structure information to improve alignments is described, which then allows comparisons at even deeper taxonomic levels.

508 citations


Journal ArticleDOI
TL;DR: The analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT, thus establishing a foundation for reconstructing the evolutionary transitions that underlie diversity in genome content and organization.
Abstract: The rapid increase in published genomic sequences for bacteria presents the first opportunity to reconstruct evolutionary events on the scale of entire genomes. However, extensive lateral gene transfer (LGT) may thwart this goal by preventing the establishment of organismal relationships based on individual gene phylogenies. The group for which cases of LGT are most frequently documented and for which the greatest density of complete genome sequences is available is the γ-Proteobacteria, an ecologically diverse and ancient group including free-living species as well as pathogens and intracellular symbionts of plants and animals. We propose an approach to multigene phylogeny using complete genomes and apply it to the case of the γ-Proteobacteria. We first applied stringent criteria to identify a set of likely gene orthologs and then tested the compatibilities of the resulting protein alignments with several phylogenetic hypotheses. Our results demonstrate phylogenetic concordance among virtually all (203 of 205) of the selected gene families, with each of the exceptions consistent with a single LGT event. The concatenated sequences of the concordant families yield a fully resolved phylogeny. This topology also received strong support in analyses aimed at excluding effects of heterogeneity in nucleotide base composition across lineages. Our analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT. This gene set can be identified and used to yield robust hypotheses for organismal phylogenies, thus establishing a foundation for reconstructing the evolutionary transitions, such as gene transfer, that underlie diversity in genome content and organization.

Journal ArticleDOI
TL;DR: Comparative analysis of the complete genome sequences of 13 baculoviruses revealed a core set of 30 genes, 20 of which have known functions; shared protein phylogenetic profiles provided evidence for two putative DNA repair systems and for viral proteins specific for infection of lymantrid hosts.
Abstract: Comparative analysis of the complete genome sequences of 13 baculoviruses revealed a core set of 30 genes, 20 of which have known functions. Phylogenetic analyses of these 30 genes yielded a tree with 4 major groups: the genus Granulovirus (GVs), the group I and II lepidopteran nucleopolyhedroviruses (NPVs), and the dipteran NPV, CuniNPV. These major divisions within the family Baculoviridae were also supported by phylogenies based on gene content and gene order. Gene content mapping has revealed the patterns of gene acquisitions and losses that have taken place during baculovirus evolution, and it has highlighted the fluid nature of baculovirus genomes. The identification of shared protein phylogenetic profiles provided evidence for two putative DNA repair systems and for viral proteins specific for infection of lymantrid hosts. Examination of gene order conservation revealed a core gene cluster of four genes, helicase, lef-5, ac96, and 38K(ac98), whose relative positions are conserved in all baculovirus genomes.

Journal ArticleDOI
TL;DR: Not only were phylogenetic criteria superior to traditional reproductive compatibility criteria in revealing the full species diversity of Neurospora, but also significant phylogenetic subdivisions were detected within some species.
Abstract: To critically examine the relationship between species recognized by phylogenetic and reproductive compatibility criteria, we applied phylogenetic species recognition (PSR) to the fungus in which biological species recognition (BSR) has been most comprehensively applied, the well-studied genus Neurospora. Four independent anonymous nuclear loci were characterized and sequenced from 147 individuals that were representative of all described outbreeding species of Neurospora. We developed a consensus-tree approach that identified monophyletic genealogical groups that were concordantly supported by the majority of the loci, or were well supported by at least one locus but not contradicted by any other locus. We recognized a total of eight phylogenetic species, five of which corresponded with the five traditional biological species, and three of which were newly discovered. Not only were phylogenetic criteria superior to traditional reproductive compatibility criteria in revealing the full species diversity of Neurospora, but also significant phylogenetic subdivisions were detected within some species. Despite previous suggestions of hybridization between N. crassa and N. intermedia in nature, and the fact that several putative hybrid individuals were included in this study, no molecular evidence in support of recent interspecific gene flow or the existence of true hybrids was observed. The sequence data from the four loci were combined and used to clarify how the species discovered by PSR were related. Although species-level clades were strongly supported, the phylogenetic relationships among species remained difficult to resolve, perhaps due to conflicting signals resulting from differential lineage sorting.

Journal ArticleDOI
TL;DR: It is advisable to concatenate many gene sequences and use a multigene gamma distance for estimating divergence times rather than using the individual gene approach, and nuclear proteins are generally more suitable than mitochondrial proteins for time estimation.
Abstract: Although the phylogenetic relationships of major lineages of primate species are relatively well established, the times of divergence of these lineages as estimated by molecular data are still controversial. This controversy has been generated in part because different authors have used different types of molecular data, different statistical methods, and different calibration points. We have therefore examined the effects of these factors on the estimates of divergence times and reached the following conclusions: (1) It is advisable to concatenate many gene sequences and use a multigene gamma distance for estimating divergence times rather than using the individual gene approach. (2) When sequence data from many nuclear genes are available, protein sequences appear to give more robust estimates than DNA sequences. (3) Nuclear proteins are generally more suitable than mitochondrial proteins for time estimation. (4) It is important first to construct a phylogenetic tree for a group of species using some outgroups and then estimate the branch lengths. (5) It appears to be better to use a few reliable calibration points rather than many unreliable ones. Considering all these factors and using two calibration points, we estimated that the human lineage diverged from the chimpanzee, gorilla, orangutan, Old World monkey, and New World monkey lineages approximately 6 MYA (with a range of 5-7), 7 MYA (range, 6-8), 13 MYA (range, 12-15), 23 MYA (range, 21-25), and 33 MYA (range 32-36).

Journal ArticleDOI
01 Oct 2003-Genetics
TL;DR: The authors' phylogenetic analyses show two gene clades within the core eudicots, euAP1 (including Arabidopsis APETALA1 and Antirrhinum SQUAMOSA) and euFUL (including Arabidopsis FRUITFULL); the euAP1 clade includes key regulators of floral development that have been implicated in the specification of perianth identity.
Abstract: Phylogenetic analyses of angiosperm MADS-box genes suggest that this gene family has undergone multiple duplication events followed by sequence divergence. To determine when such events have taken place and to understand the relationships of particular MADS-box gene lineages, we have identified APETALA1/FRUITFULL-like MADS-box genes from a variety of angiosperm species. Our phylogenetic analyses show two gene clades within the core eudicots, euAP1 (including Arabidopsis APETALA1 and Antirrhinum SQUAMOSA) and euFUL (including Arabidopsis FRUITFULL). Non-core eudicot species have only sequences similar to euFUL genes (FUL-like). The predicted protein products of euFUL and FUL-like genes share a conserved C-terminal motif. In contrast, predicted products of members of the euAP1 gene clade contain a different C terminus that includes an acidic transcription activation domain and a farnesylation signal. Sequence analyses indicate that the euAP1 amino acid motifs may have arisen via a translational frameshift from the euFUL/FUL-like motif. The euAP1 gene clade includes key regulators of floral development that have been implicated in the specification of perianth identity. However, the presence of euAP1 genes only in core eudicots suggests that there may have been changes in mechanisms of floral development that are correlated with the fixation of floral structure seen in this clade.

Journal ArticleDOI
TL;DR: The possible evolutionary routes to parthenogenesis are reviewed based on a survey of the phylogenetic relationships between sexual and parthenogenetic lineages in a broad range of animals and the influences of these mechanisms on both the genetic properties and the ecological life styles of the resulting lineages are discussed.
Abstract: In theory, parthenogenetic lineages have low evolutionary potential because they inexorably accumulate deleterious mutations and do not generate much genotypic diversity. As a result, most parthenogenetic taxa occupy the terminal nodes of phylogenetic trees. The rate and mode of development of parthenogenesis are important factors to consider when assessing its costs and benefits since they determine both the level of genetic diversity and the ecological adaptability of the resulting lineages. The origin of parthenogenesis is polyphyletic in many taxa, suggesting that genetic systems maintaining sexuality are often labile. In addition, the loss of sex may be achieved in several ways, leading to parthenogenetic lineages with distinct genetic profiles. This could then influence not only the fate of such lineages in the long term, but also the outcome of competition with their sexual counterparts in the short term. In this paper, we review the possible evolutionary routes to parthenogenesis based on a survey of the phylogenetic relationships between sexual and parthenogenetic lineages in a broad range of animals. We also examine the different mechanisms by which parthenogenetic lineages could arise, and discuss the influences of these mechanisms on both the genetic properties and the ecological life styles of the resulting lineages.

Journal ArticleDOI
TL;DR: The nod genes of beta-rhizobia are very similar to those of alpha-rhizobia, strongly supporting the hypothesis of the unique origin of common nod genes, and the findings suggest that beta-rhizobia evolved from diazotrophs through multiple lateral nod gene transfers.
Abstract: Following the initial discovery of two legume-nodulating Burkholderia strains (L. Moulin, A. Munive, B. Dreyfus, and C. Boivin-Masson, Nature 411:948–950, 2001), we identified as nitrogen-fixing legume symbionts at least 50 different strains of Burkholderia caribensis and Ralstonia taiwanensis, all belonging to the β-subclass of proteobacteria, thus extending the phylogenetic diversity of the rhizobia. R. taiwanensis was found to represent 93% of the Mimosa isolates in Taiwan, indicating that β-proteobacteria can be the specific symbionts of a legume. The nod genes of rhizobial β-proteobacteria (β-rhizobia) are very similar to those of rhizobia from the α-subclass (α-rhizobia), strongly supporting the hypothesis of the unique origin of common nod genes. The β-rhizobial nod genes are located on a 0.5-Mb plasmid, together with the nifH gene, in R. taiwanensis and Burkholderia phymatum. Phylogenetic analysis of available nodA gene sequences clustered β-rhizobial sequences in two nodA lineages intertwined with α-rhizobial sequences. On the other hand, the β-rhizobia were grouped with free-living nitrogen-fixing β-proteobacteria on the basis of the nifH phylogenetic tree. These findings suggest that β-rhizobia evolved from diazotrophs through multiple lateral nod gene transfers.

Journal ArticleDOI
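TL;DR: The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far.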
Abstract: Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life.

Journal ArticleDOI
TL;DR: A new sequence distance measure based on the relative information between the sequences using Lempel-Ziv complexity is proposed, which can be used to construct phylogenetic trees.
Abstract: Motivation: Most existing approaches for phylogenetic inference use multiple alignment of sequences and assume some sort of an evolutionary model. The multiple alignment strategy does not work for all types of data, e.g. whole genome phylogeny, and the evolutionary models may not always be correct. We propose a new sequence distance measure based on the relative information between the sequences using Lempel-Ziv complexity. The distance matrix thus obtained can be used to construct phylogenetic trees. Results: The proposed approach does not require sequence alignment and is totally automatic. The algorithm has successfully constructed consistent phylogenies for real and simulated data sets. Availability: Available on request from the authors.
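As a rough illustration of an alignment-free distance of this kind, the Python sketch below counts the components of a Lempel-Ziv (1976) exhaustive parsing and combines the counts into a normalized distance. The specific formula used, max(c(SQ) - c(S), c(QS) - c(Q)) / max(c(S), c(Q)), is an assumption chosen for illustration, since the abstract does not give the authors' exact definition. The function names `lz_complexity` and `lz_distance` are hypothetical.

```python
def lz_complexity(s):
    """Number of components in the Lempel-Ziv (1976) exhaustive parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the component while it can still be copied from earlier text.
        # The copy may overlap the component but must start before position i.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count

def lz_distance(s, q):
    """Assumed normalized distance: extra components needed to describe one
    sequence once the other is known, scaled by the larger complexity."""
    if not s or not q:
        raise ValueError("both sequences must be non-empty")
    cs, cq = lz_complexity(s), lz_complexity(q)
    extra_q_given_s = lz_complexity(s + q) - cs
    extra_s_given_q = lz_complexity(q + s) - cq
    return max(extra_q_given_s, extra_s_given_q) / max(cs, cq)

if __name__ == "__main__":
    # Made-up toy sequences, for illustration only.
    seqs = {"x": "ACGTACGTAC", "y": "ACGTACGAAC", "z": "TTTTGGGGCC"}
    names = sorted(seqs)
    for a in names:
        print(a, [round(lz_distance(seqs[a], seqs[b]), 3) for b in names])
```

The pairwise values form a distance matrix that could then be passed to any distance-based tree-building method, such as neighbor joining. This substring-search parsing is simple but slow for long sequences; a suffix-structure-based parser would be needed at genome scale.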

Journal ArticleDOI
TL;DR: It is concluded that by using this dataset it will be possible to type future virus isolates rapidly on the basis of their nucleotide sequence and make inferences about their origins.
Abstract: A sequence 375 nucleotides in length, which included the region encoding the cleavage activation site and signal peptide of the fusion protein gene, was determined for 174 isolates of Newcastle disease virus (avian paramyxovirus type 1). These were compared with the sequences of 164 isolates published on GenBank, and the resulting alignment was analysed phylogenetically using maximum likelihood. The results are presented as unrooted phylogenetic trees. Briefly, the isolates divided into six broadly distinct groups (lineages 1 to 6). Lineages 3 and 4 were further subdivided into four sublineages (a to d) and lineage 5 into five lineages (a to e). Considerable genetic heterogeneity was detected within avian paramyxoviruses type 1, which appears to be influenced by host, time and geographical origin. It is concluded that by using this dataset it will be possible to type future virus isolates rapidly on the basis of their nucleotide sequence and make inferences about their origins.

Proceedings ArticleDOI
01 Jul 2003
TL;DR: The idea of "guaranteed visibility", where highlighted areas are treated as landmarks that must remain visually apparent at all times, is introduced in TreeJuxtaposer, a system designed to support the comparison task for large trees of several hundred thousand nodes.
Abstract: Structural comparison of large trees is a difficult task that is only partially supported by current visualization techniques, which are mainly designed for browsing. We present TreeJuxtaposer, a system designed to support the comparison task for large trees of several hundred thousand nodes. We introduce the idea of "guaranteed visibility", where highlighted areas are treated as landmarks that must remain visually apparent at all times. We propose a new methodology for detailed structural comparison between two trees and provide a new nearly-linear algorithm for computing the best corresponding node from one tree to another. In addition, we present a new rectilinear Focus+Context technique for navigation that is well suited to the dynamic linking of side-by-side views while guaranteeing landmark visibility and constant frame rates. These three contributions result in a system delivering a fluid exploration experience that scales both in the size of the dataset and the number of pixels in the display. We have based the design decisions for our system on the needs of a target audience of biologists who must understand the structural details of many phylogenetic, or evolutionary, trees. Our tool is also useful in many other application domains where tree comparison is needed, ranging from network management to call graph optimization to genealogy.
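To make the notion of a "best corresponding node" concrete, the Python sketch below scores every node of a second tree against a given node of the first by the overlap of their leaf sets (intersection size divided by union size) and returns the highest-scoring one. The scoring rule, the nested-tuple tree encoding, and the names `leaf_sets` and `best_match` are assumptions made for illustration. This brute-force search is not the nearly-linear algorithm the abstract refers to.

```python
def leaf_sets(tree):
    """Return one frozenset of leaf names per node, in post-order.

    Hypothetical encoding: a leaf is a string, an internal node is a tuple
    of child subtrees.
    """
    sets = []

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        sets.append(leaves)
        return leaves

    walk(tree)
    return sets

def best_match(leaves, other_tree):
    """Find the node of other_tree whose leaf set overlaps `leaves` most.

    Returns (leaf set of the best node, overlap score in [0, 1]).
    Ties are resolved in favor of the first node visited.
    """
    best, best_score = None, -1.0
    for candidate in leaf_sets(other_tree):
        score = len(leaves & candidate) / len(leaves | candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

if __name__ == "__main__":
    # Made-up four-leaf trees, for illustration only.
    t1 = (("A", "B"), ("C", "D"))
    t2 = (("A", "C"), ("B", "D"))
    for node in leaf_sets(t1):
        print(sorted(node), best_match(node, t2))
```

A score of 1 means the two trees contain an identical clade. Lower scores flag the places where the trees disagree, which is the kind of structural difference a comparison tool would highlight.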

Journal ArticleDOI
TL;DR: This unit provides a general description of reconstructing evolutionary trees with PAUP* 4.0, working through an example analysis of mitochondrial DNA sequence data that uses the parsimony and likelihood criteria to infer optimal trees.
Abstract: This unit provides a general description of reconstructing evolutionary trees using PAUP* 4.0. The protocol takes users through an example analysis of mitochondrial DNA sequence data using the parsimony and the likelihood criteria to infer optimal trees. The protocol also discusses searching options available in PAUP* and demonstrates how to import non-NEXUS formats. Finally, a general discussion is given regarding the pros and cons of the "model-free" and "model-based" methods used throughout the protocol.

Journal ArticleDOI
TL;DR: Isolates of Cryptosporidium from the Czech Republic were characterized from a variety of different hosts using sequence and phylogenetic analysis of the 18S ribosomal DNA and the heat-shock (HSP-70) gene.
Abstract: Isolates of Cryptosporidium from the Czech Republic were characterized from a variety of different hosts using sequence and phylogenetic analysis of the 18S ribosomal DNA and the heat-shock (HSP-70) gene. Analysis expanded the host range of accepted species and identified several novel genotypes, including horse, Eurasian woodcock, rabbit, and cervid genotypes.

Journal ArticleDOI
TL;DR: The cumulative results of phylogenetic reconstructions suggest that the alkene/aromatic monooxygenases diverged first from the last common ancestor for these enzymes, followed by the phenol hydroxylases, Amo alkene monooxygenase, and methane monooxygenases.
Abstract: Based on structural, biochemical, and genetic data, the soluble diiron monooxygenases can be divided into four groups: the soluble methane monooxygenases, the Amo alkene monooxygenase of Rhodococcus corallinus B-276, the phenol hydroxylases, and the four-component alkene/aromatic monooxygenases. The limited phylogenetic distribution of these enzymes among bacteria, together with available genetic evidence, indicates that they have been spread largely through horizontal gene transfer. Phylogenetic analyses reveal that the α- and β-oxygenase subunits are paralogous proteins and were derived from an ancient gene duplication of a carboxylate-bridged diiron protein, with subsequent divergence yielding a catalytic α-oxygenase subunit and a structural β-oxygenase subunit. The oxidoreductase and ferredoxin components of these enzymes are likely to have been acquired by horizontal transfer from ancestors common to unrelated diiron and Rieske center oxygenases and other enzymes. The cumulative results of phylogenetic reconstructions suggest that the alkene/aromatic monooxygenases diverged first from the last common ancestor for these enzymes, followed by the phenol hydroxylases, Amo alkene monooxygenase, and methane monooxygenases.

Journal ArticleDOI
TL;DR: The phylogenetic relationship of 12 ammonia-oxidizing isolates (eight nitrosospiras and four nitrosomonads) was investigated, and the estuarine isolate Nitrosomonas sp. Nm143 formed a separate lineage together with three other marine isolates.
Abstract: The phylogenetic relationship of 12 ammonia-oxidizing isolates (eight nitrosospiras and four nitrosomonads), for which no gene sequence information was available previously, was investigated based on their genes encoding 16S rRNA and the active site subunit of ammonia monooxygenase (AmoA). Almost full-length 16S rRNA gene sequences were determined for the 12 isolates. In addition, 16S rRNA gene sequences of 15 ammonia-oxidizing bacteria (AOB) published previously were completed to allow for a more reliable phylogeny inference of members of this guild. Moreover, sequences of 453 bp fragments of the amoA gene were determined from 15 AOB, including the 12 isolates, and completed for 10 additional AOB. 16S rRNA gene and amoA-based analyses, including all available sequences of AOB pure cultures, were performed to determine the position of the newly retrieved sequences within the established phylogenetic framework. The resulting 16S rRNA gene and amoA tree topologies were similar but not identical and demonstrated a superior resolution of 16S rRNA versus amoA analysis. While 11 of the 12 isolates could be assigned to different phylogenetic groups recognized within the betaproteobacterial AOB, the estuarine isolate Nitrosomonas sp. Nm143 formed a separate lineage together with three other marine isolates whose 16S rRNA sequences have not been published but have been deposited in public databases. In addition, 17 environmentally retrieved 16S rRNA gene sequences not assigned previously and all originating exclusively from marine or estuarine sites clearly belong to this lineage.

Journal ArticleDOI
TL;DR: Evidence from approximately 200,000 nucleotides suggests that polyploidy in Gossypium led to a modest enhancement in rates of nucleotide substitution, and per-gene phylogenetic analysis indicates an absence of gene conversion or recombination among homoeologs subsequent to allopolyploid formation.
Abstract: Molecular evolutionary rate variation in Gossypium (cotton) was characterized using sequence data for 48 nuclear genes from both genomes of allotetraploid cotton, models of its diploid progenitors, and an outgroup. Substitution rates varied widely among the 48 genes, with silent and replacement substitution levels varying from 0.018 to 0.162 and from 0.000 to 0.073, respectively, in comparisons between orthologous Gossypium and outgroup sequences. However, about 90% of the genes had silent substitution rates spanning a more narrow threefold range. Because there was no evidence of rate heterogeneity among lineages for any gene and because rates were highly correlated in independent tests, evolutionary rate is inferred to be a property of each gene or its genetic milieu rather than the clade to which it belongs. Evidence from approximately 200,000 nucleotides (40,000 per genome) suggests that polyploidy in Gossypium led to a modest enhancement in rates of nucleotide substitution. Phylogenetic analysis for each gene yielded the topology expected from organismal history, indicating an absence of gene conversion or recombination among homoeologs subsequent to allopolyploid formation. Using the mean synonymous substitution rate calculated across the 48 genes, allopolyploid cotton is estimated to have formed circa 1.5 million years ago (MYA), after divergence of the diploid progenitors about 6.7 MYA.

Journal ArticleDOI
TL;DR: A novel statistic to compare profiles of DNA sequences and a greedy approach to search for common subprofiles are presented, and PhyloCon is shown to perform well on both synthetic and biological data.
Abstract: Motivation: Discovery of regulatory motifs in unaligned DNA sequences remains a fundamental problem in computational biology. Two categories of algorithms have been developed to identify common motifs from a set of DNA sequences. The first can be called a ‘multiple genes, single species’ approach. It proposes that a degenerate motif is embedded in some or all of the otherwise unrelated input sequences and tries to describe a consensus motif and identify its occurrences. It is often used for co-regulated genes identified through experimental approaches. The second approach can be called ‘single gene, multiple species’. It requires orthologous input sequences and tries to identify unusually well conserved regions by phylogenetic footprinting. Both approaches perform well, but each has some limitations. It is tempting to combine the knowledge of co-regulation among different genes and conservation among orthologous genes to improve our ability to identify motifs. Results: Based on the Consensus algorithm previously established by our group, we introduce a new algorithm called PhyloCon (Phylogenetic Consensus) that takes into account both conservation among orthologous genes and co-regulation of genes within a species. This algorithm first aligns conserved regions of orthologous sequences into multiple sequence alignments, or profiles, then compares profiles representing non-orthologous sequences. Motifs emerge as common regions in these profiles. Here we present a novel statistic to compare profiles of DNA sequences and a greedy approach to search for common subprofiles. We demonstrate that PhyloCon performs well on both synthetic and biological data. Availability: Software available upon request from the authors. http://ural.wustl.edu/softwares.html

Journal ArticleDOI
TL;DR: The phylogenetic relationships of all known species of the genus Aeromonas were investigated by using the sequence of gyrB, a gene that encodes the B-subunit of DNA gyrase, which proved to be an excellent molecular chronometer for phylogenetic studies.
Abstract: The phylogenetic relationships of all known species of the genus Aeromonas were investigated by using the sequence of gyrB, a gene that encodes the B-subunit of DNA gyrase. Nucleotide sequences of gyrB were determined from 53 Aeromonas strains, including some new isolates, which were also characterized by analysis of the 16S rDNA variable regions. The results support the recognition of the family Aeromonadaceae, as distinct from Plesiomonas shigelloides and other enteric bacteria. This phylogenetic marker revealed strain groupings that are consistent with the taxonomic organization of all Aeromonas species described to date. In particular, gyrB results agreed with 16S rDNA analysis; moreover, the former showed a higher capacity to differentiate between species. The present analysis was useful for the elucidation of reported discrepancies between different DNA-DNA hybridization sets. Additionally, due to the sequence diversity found at the intraspecies level, gyrB is proposed as a useful target for simultaneous identification of species and strains. In conclusion, the gyrB gene has proved to be an excellent molecular chronometer for phylogenetic studies of the genus Aeromonas.

Journal ArticleDOI
TL;DR: Mitogenomic data strongly supported not only the monophyly of the teleosts (osteoglossomorphs and above), but also a sister-group relationship between the teleosts and a clade comprising the acipenseriforms, lepisosteids, and Amia, with the polypteriforms occupying the most basal position in the actinopterygian phylogeny.

Journal ArticleDOI
TL;DR: Relying solely upon ITS nrDNA analysis to reveal phylogenetic patterns in a complex genus such as Nicotiana is insufficient, and it is clear that conventional analysis of single data sets, such as ITS, is likely to be misleading in at least some respects about evolutionary history.