Showing papers in "Molecular Biology and Evolution in 1999"


Journal ArticleDOI
TL;DR: A method for constructing networks from recombination-free population data that combines features of Kruskal's algorithm for finding minimum spanning trees by favoring short connections, and Farris's maximum-parsimony (MP) heuristic algorithm, which sequentially adds new vertices called "median vectors", except that the MJ method does not resolve ties.
Abstract: Reconstructing phylogenies from intraspecific data (such as human mitochondrial DNA variation) is often a challenging task because of large sample sizes and small genetic distances between individuals. The resulting multitude of plausible trees is best expressed by a network which displays alternative potential evolutionary paths in the form of cycles. We present a method ("median joining" [MJ]) for constructing networks from recombination-free population data that combines features of Kruskal's algorithm for finding minimum spanning trees by favoring short connections, and Farris's maximum-parsimony (MP) heuristic algorithm, which sequentially adds new vertices called "median vectors", except that our MJ method does not resolve ties. The MJ method is hence closely related to the earlier approach of Foulds, Hendy, and Penny for estimating MP trees but can be adjusted to the level of homoplasy by setting a parameter epsilon. Unlike our earlier reduced median (RM) network method, MJ is applicable to multistate characters (e.g., amino acid sequences). An additional feature is the speed of the implemented algorithm: a sample of 800 worldwide mtDNA hypervariable segment I sequences requires less than 3 h on a Pentium 120 PC. The MJ method is demonstrated on a Tibetan mitochondrial DNA RFLP data set.

9,937 citations
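
As a rough illustration of the Kruskal ingredient mentioned in the abstract above, the following minimal sketch builds a minimum spanning tree over haplotypes by favoring short connections. It is not the median-joining method itself: it adds no median vectors, it has no epsilon parameter, and, unlike MJ, it resolves ties (arbitrarily, by sort order). The function names, the Hamming distance, and the toy haplotypes are assumptions made for illustration only.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(haplotypes):
    """Kruskal's algorithm: return MST edges as (i, j, distance) tuples."""
    parent = list(range(len(haplotypes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first (ties broken by index order).
    edges = sorted(
        (hamming(haplotypes[i], haplotypes[j]), i, j)
        for i, j in combinations(range(len(haplotypes)), 2)
    )
    tree = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two separate components
            parent[root_i] = root_j
            tree.append((i, j, dist))
    return tree


# Hypothetical toy haplotypes, not data from the paper.
haplotypes = ["AAAA", "AAAT", "AATT", "TTTT"]
mst = minimum_spanning_tree(haplotypes)
```

A network method in the spirit of the abstract would keep all equally short connections instead of discarding the ones that close a cycle, which is where the cycles in the resulting network come from.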


Journal ArticleDOI
TL;DR: A modification of the KH test that takes into account a multiplicity of testings is presented; the KH test was designed for comparing two topologies but is often used for comparing many topologies.
Abstract: The maximum-likelihood method for inferring molecular phylogeny (Felsenstein 1981) is being widely used. The probabilistic model for generating the molecular sequences is specified by the substitution process and the tree topology. The parameters for the substitution process and the branch lengths are estimated by maximizing the likelihood, and then the tree topology is estimated by maximizing the maximized likelihood. To obtain the confidence limit of the topology, the test of Kishino and Hasegawa (1989), referred to as the KH test, is often used in practice. The same idea that is the basis for the KH test is also found in the statistical literature (Linhart 1988; Vuong 1989). The KH test was designed for comparing two topologies but is often used for comparing many topologies. This use of the KH test leads to overconfidence for a wrong tree, because the sampling error due to the selection of the topology is overlooked in it. In this note, we present a modification of the KH test to take into account a multiplicity of testings. Let a index the topologies and L

4,049 citations


Journal ArticleDOI
TL;DR: Findings show that a slowly evolving protein-coding gene such as RPB2 is useful for diagnosing phylogenetic relationships among fungi, and suggest that fruiting body formation and forcible discharge of ascospores were characters gained early in the evolution of the Ascomycota.
Abstract: In an effort to establish a suitable alternative to the widely used 18S rRNA system for molecular systematics of fungi, we examined the nuclear gene RPB2, encoding the second largest subunit of RNA polymerase II. Because RPB2 is a single-copy gene of large size with a modest rate of evolutionary change, it provides good phylogenetic resolution of Ascomycota. While the RPB2 and 18S rDNA phylogenies were highly congruent, the RPB2 phylogeny did result in much higher bootstrap support for all the deeper branches within the orders and for several branches between orders of the Ascomycota. There are several strongly supported phylogenetic conclusions. The Ascomycota is composed of three major lineages: Archiascomycetes, Saccharomycetales, and Euascomycetes. Within the Euascomycetes, plectomycetes and pyrenomycetes are monophyletic groups, and the Pleosporales and Dothideales are distinct sister groups within the Loculoascomycetes. We confirm the placement of Neolecta within the Archiascomycetes, suggesting that fruiting body formation and forcible discharge of ascospores were characters gained early in the evolution of the Ascomycota. These findings show that a slowly evolving protein-coding gene such as RPB2 is useful for diagnosing phylogenetic relationships among fungi.

2,573 citations


Journal ArticleDOI
TL;DR: The Bayesian framework for analyzing aligned nucleotide sequence data to reconstruct phylogenies, assess uncertainty in the reconstructions, and perform other statistical inferences is developed and a Markov chain Monte Carlo sampler is employed to sample trees and model parameter values from their joint posterior distribution.
Abstract: We further develop the Bayesian framework for analyzing aligned nucleotide sequence data to reconstruct phylogenies, assess uncertainty in the reconstructions, and perform other statistical inferences. We employ a Markov chain Monte Carlo sampler to sample trees and model parameter values from their joint posterior distribution. All statistical inferences are naturally based on this sample. The sample provides a most-probable tree with posterior probabilities for each clade, information that is qualitatively similar to that for the maximum-likelihood tree with bootstrap proportions and permits further inferences on tree topology, branch lengths, and model parameter values. On moderately large trees, the computational advantage of our method over bootstrapping a maximum-likelihood analysis can be considerable. In an example with 31 taxa, the time expended by our software is orders of magnitude less than that a widely used phylogeny package for bootstrapping maximum likelihood estimation would require to achieve comparable statistical accuracy. While there has been substantial debate over the proper interpretation of bootstrap proportions, Bayesian posterior probabilities clearly and directly quantify uncertainty in questions of biological interest, at least from a Bayesian perspective. Because our tree proposal algorithms are independent of the choice of likelihood function, they could also be used in conjunction with likelihood models more complex than those we have currently implemented.

1,542 citations


Journal ArticleDOI
TL;DR: The finding of a recent common ancestor (probably in the last 120,000 years), coupled with a strong signal of demographic expansion in all populations, suggests either a recent human expansion from a small ancestral population, or natural selection acting on the Y chromosome.
Abstract: We use variation at a set of eight human Y chromosome microsatellite loci to investigate the demographic history of the Y chromosome. Instead of assuming a population of constant size, as in most of the previous work on the Y chromosome, we consider a model which permits a period of recent population growth. We show that for most of the populations in our sample this model fits the data far better than a model with no growth. We estimate the demographic parameters of this model for each population and also the time to the most recent common ancestor. Since there is some uncertainty about the details of the microsatellite mutation process, we consider several plausible mutation schemes and estimate the variance in mutation size simultaneously with the demographic parameters of interest. Our finding of a recent common ancestor (probably in the last 120,000 years), coupled with a strong signal of demographic expansion in all populations, suggests either a recent human expansion from a small ancestral population, or natural selection acting on the Y chromosome.

1,135 citations


Journal ArticleDOI
TL;DR: The phylogenies of the AP endonuclease and RNase H domains were determined and are consistent with the monophyletic acquisition of these domains; assuming vertical descent, the RT-based phylogeny suggested that non-LTR elements are as old as eukaryotes, with each of the 11 clades dating back to the Precambrian era.
Abstract: A comprehensive phylogenetic analysis was conducted of non-long-terminal-repeat (non-LTR) retrotransposons based on an extended sequence alignment of their reverse transcriptase (RT) domain. The 440 amino acid positions used included a region proposed to be similar to the "thumb" of the right-handed RT structure found in retroviruses. All identified non-LTR elements could be grouped into 11 distinct clades. Using the rates of sequence change derived from studies of the vertical inheritance of R1 and R2 elements in arthropods as a comparison, we found no evidence for the horizontal transmission of non-LTR elements. Assuming vertical descent, the phylogeny suggested that non-LTR elements are as old as eukaryotes, with each of the 11 clades dating back to the Precambrian era. The analysis enabled us to propose a simple chronology for the acquisition of different enzymatic domains in the evolution of the non-LTR class of retrotransposons. The first non-LTR elements were sequence specific by virtue of a restriction-enzyme-like endonuclease located downstream of the RT domain. Evolving from this original group were elements (eight clades) that acquired an apurinic-apyrimidic endonuclease-like domain upstream of the RT domain. Finally, four of these clades have inherited an RNase H domain downstream of the RT domain. The phylogenies of the AP endonuclease and RNase H domains were also determined for this report and are consistent with the monophyletic acquisition of these domains. These studies represent the most comprehensive effort to date to trace the evolution of a major class of transposable elements.

547 citations


Journal ArticleDOI
TL;DR: Computer simulation showed that this method accurately estimated the numbers of synonymous and nonsynonymous substitutions per site, as long as the substitution number on each branch was relatively small, and the false-positive rate for detecting the selective force was generally low.
Abstract: A method was developed for detecting the selective force at single amino acid sites given a multiple alignment of protein-coding sequences. The phylogenetic tree was reconstructed using the number of synonymous substitutions. Then, the neutrality was tested for each codon site using the numbers of synonymous and nonsynonymous changes throughout the phylogenetic tree. Computer simulation showed that this method accurately estimated the numbers of synonymous and nonsynonymous substitutions per site, as long as the substitution number on each branch was relatively small. The false-positive rate for detecting the selective force was generally low. On the other hand, the true-positive rate for detecting the selective force depended on the parameter values. Within the range of parameter values used in the simulation, the true-positive rate increased as the strength of the selective force and the total branch length (namely the total number of synonymous substitutions per site) in the phylogenetic tree increased. In particular, with the relative rate of nonsynonymous substitutions to synonymous substitutions being 5.0, most of the positively selected codon sites were correctly detected when the total branch length in the phylogenetic tree was ≥ 2.5. When this method was applied to the human leukocyte antigen (HLA) gene, which included antigen recognition sites (ARSs), positive selection was detected mainly on ARSs. This finding confirmed the effectiveness of the present method with actual data. Moreover, two amino acid sites were newly identified as positively selected in non-ARSs. The three-dimensional structure of the HLA molecule indicated that these sites might be involved in antigen recognition. Positively selected amino acid sites were also identified in the envelope protein of human immunodeficiency virus and the influenza virus hemagglutinin protein.
This method may be helpful for predicting functions of amino acid sites in proteins, especially in the present situation, in which sequence data are accumulating at an enormous speed.

494 citations


Journal ArticleDOI
Xun Gu
TL;DR: A site-specific profile based on the hidden Markov model is developed to identify critical amino acid residues that are responsible for these functional differences between two gene clusters, which may have great potential in functional genomics.
Abstract: Functional innovations after gene duplication may result in altered functional constraints between member gene clusters of a gene family. This type (type I) of functional divergence is measured by the coefficient of functional divergence (theta lambda), which can be interpreted as the decrease in rate correlation between gene clusters, or the probability that the evolutionary rate at a site is statistically independent between two gene clusters. A simple stochastic model has been developed for estimating theta lambda and testing its statistical significance. The current model includes the model of rate variation among sites as a special case when theta lambda = 0. Moreover, we have developed a site-specific profile based on the hidden Markov model to identify critical amino acid residues that are responsible for these functional differences between two gene clusters, which may have great potential in functional genomics.

438 citations


Journal ArticleDOI
TL;DR: CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods.
Abstract: Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).

411 citations


Journal ArticleDOI
TL;DR: It is demonstrated that hymenopteran parasitoids of frugivorous Drosophila species are especially susceptible to Wolbachia infection, which strongly supports the hypothesis of frequent natural Wolbachia transfers into other species and opens a new field for genetic exchanges among species, especially in host-parasitoid associations.
Abstract: Endosymbiotic Wolbachia infect a number of arthropod species in which they can affect the reproductive system. While maternally transmitted, unlike mitochondria their molecular phylogeny does not parallel that of their hosts. This strongly suggests horizontal transmission among species, the mechanisms of which remain unknown. Such transfers require intimate between-species relationships, and thus host-parasite associations are outstandingly appropriate for study. Here, we demonstrate that hymenopteran parasitoids of frugivorous Drosophila species are especially susceptible to Wolbachia infection. Of the five common European species, four proved to be infected; furthermore, multiple infections are common, with one species being doubly infected and two triply infected (first report). Phylogenetic statuses of the Wolbachia infecting the different species of the community have been studied using the gene wsp, a highly variable gene recently described. This study reveals exciting similarities between the Wolbachia variants found in parasitoids and their hosts. These arguments strongly support the hypothesis of frequent natural Wolbachia transfers into other species and open a new field for genetic exchanges among species, especially in host-parasitoid associations.

410 citations


Journal ArticleDOI
TL;DR: In both recombinants, the breakpoints were found to be in similar positions, within the fusion peptide of the envelope protein, demonstrating that a single recombination event occurred prior to the divergence of these two strains; this is the first report of recombination in natural populations of dengue virus.
Abstract: A split decomposition analysis of dengue (DEN) virus gene sequences revealed extensive networked evolution, indicative of recombination, among DEN-1 strains but not within serotypes DEN-2, DEN-3, or DEN-4. Within DEN-1, two viruses sampled from South America in the last 10 years were identified as recombinants. To map the breakpoints and test their statistical support, we developed a novel maximum likelihood method. In both recombinants, the breakpoints were found to be in similar positions, within the fusion peptide of the envelope protein, demonstrating that a single recombination event occurred prior to the divergence of these two strains. This is the first report of recombination in natural populations of dengue virus.

Journal ArticleDOI
TL;DR: Using CGR, it is observed that subsequences of a genome exhibit the main characteristics of the whole genome, attesting to the validity of the genomic signature concept.
Abstract: We explored DNA structures of genomes by means of a new tool derived from the "chaotic dynamical systems" theory (the so-called chaos game representation [CGR]), which allows the depiction of frequencies of oligonucleotides in the form of images. Using CGR, we observe that subsequences of a genome exhibit the main characteristics of the whole genome, attesting to the validity of the genomic signature concept. Base concentrations, stretches (runs of complementary bases or purines/pyrimidines), and patches (over- or underexpressed words of various lengths) are the main factors explaining the variability observed among sequences. The distance between images may be considered a measure of phylogenetic proximity. Eukaryotes and prokaryotes can be identified merely on the basis of their DNA structures.
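
The chaos game representation described in the abstract above can be sketched in a few lines. This is a minimal illustration of the standard CGR construction, not the authors' software: the corner assignment (A, C, G, T on the unit square), the starting point at the center, and the function names `cgr_points` and `cgr_grid` are assumptions chosen for the example.

```python
def cgr_points(sequence):
    """Return the chaos game trajectory of a DNA sequence in the unit square."""
    # Assumed corner layout; other conventions exist.
    corners = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
    x, y = 0.5, 0.5  # start at the center of the square
    points = []
    for base in sequence.upper():
        if base not in corners:
            # Simplification: ambiguous bases are skipped, which slightly
            # distorts the k-mer interpretation around them.
            continue
        cx, cy = corners[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_grid(sequence, k):
    """Count trajectory points in a 2^k x 2^k grid; each cell is one k-mer."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    # The first k - 1 points do not yet encode a full k-mer, so drop them.
    for x, y in cgr_points(sequence)[k - 1:]:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid
```

Rendering the grid as a grayscale image gives the kind of oligonucleotide-frequency picture the abstract refers to, and a distance between two such grids could serve as the image distance it mentions.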

Journal ArticleDOI
TL;DR: The hemagglutinin (HA) gene of influenza viruses encodes the major surface antigen against which neutralizing antibodies are produced during infection or vaccination, and temporal variation in the HA1 domain of HA genes of human influenza A (H3N2) viruses was examined to identify positively selected codons.
Abstract: The hemagglutinin (HA) gene of influenza viruses encodes the major surface antigen against which neutralizing antibodies are produced during infection or vaccination. We examined temporal variation in the HA1 domain of HA genes of human influenza A (H3N2) viruses in order to identify positively selected codons. Positive selection is defined for our purposes as a significant excess of nonsilent over silent nucleotide substitutions. If past mutations at positively selected codons conferred a selective advantage on the virus, then additional changes at these positions may predict which emerging strains will predominate and cause epidemics. We previously reported that a 38% excess of mutations occurred on the tip or terminal branches of the phylogenetic tree of 254 HA genes of influenza A (H3N2) viruses. Possible explanations for this excess include processes other than viral evolution during replication in human hosts. Of particular concern are mutations that occur during adaptation of viruses for growth in embryonated chicken eggs in the laboratory. Because the present study includes 357 HA sequences (a 40% increase), we were able to separately analyze those mutations assigned to internal branches. This allowed us to determine whether mutations on terminal and internal branches exhibit different patterns of selection at the level of individual codons. Additional improvements over our previous analysis include correction for a skew in the distribution of amino acid replacements across codons and analysis of a population of phylogenetic trees rather than a single tree. The latter improvement allowed us to ascertain whether minor variation in tree structure had a significant effect on our estimate of the codons under positive selection. This method also estimates that 75.6% of the nonsilent mutations are deleterious and have been removed by selection prior to sampling. 
Using the larger data set and the modified methods, we confirmed a large (40%) excess of changes on the terminal branches. We also found an excess of changes on branches leading to egg-grown isolates. Furthermore, 9 of the 18 amino acid codons, identified as being under positive selection to change when we used only mutations assigned to internal branches, were not under positive selection on the terminal branches. Thus, although there is overlap between the selected codons on terminal and internal branches, the codons under positive selection on the terminal branches differ from those on the internal branches. We also observed that there is an excess of positively selected codons associated with the receptor-binding site and with the antibody-combining sites. This association may explain why the positively selected codons are restricted in their distribution along the sequence. Our results suggest that future studies of positive selection should focus on changes assigned to the internal branches, as certain of these changes may have predictive value for identifying future successful epidemic variants.

Journal ArticleDOI
TL;DR: Analysis of an MLST data set consisting of the sequences of approximately 450-bp fragments from seven housekeeping loci from a large strain collection of Neisseria meningitidis revealed that a single nucleotide site in a meningococcal housekeeping gene is at least 80-fold more likely to change as a result of recombination than as a result of mutation.
Abstract: Multilocus sequence typing (MLST) is a recently developed nucleotide sequence-based method for the definitive assignment of isolates within bacterial populations to specific clones. MLST uses the same principles as multilocus enzyme electrophoresis and provides data that can be used to investigate aspects of the population genetics and evolution of bacterial species. We used an MLST data set consisting of the sequences of approximately 450-bp fragments from seven housekeeping loci from a large strain collection of Neisseria meningitidis to estimate the relative impact of recombination compared with point mutation in the diversification of N. meningitidis clonal complexes. 126 meningococcal isolates were assigned to 10 clonal complexes, 9 of which contained minor clonal variants. The allelic variation within each complex was classified as a recombinational exchange or a putative point mutation through a comparison of the sequences of each variant allele with that of the allele typically found in the clonal complex. The nine clonal complexes contained a total of 23 allelic variants, and analysis of the sequences of these variant alleles revealed that a single nucleotide site in a meningococcal housekeeping gene is at least 80-fold more likely to change as a result of recombination than as a result of mutation. This value is estimated to be 10-50-fold for Escherichia coli and approximately 50-fold for Streptococcus pneumoniae.

Journal ArticleDOI
TL;DR: A phylogenetic analysis of G. intestinalis showed that this species includes genotypes that represent at least seven deeply rooted lineages, herein designated assemblages A-G, and suggested that G. microti is a member of this complex.
Abstract: The long-standing controversy regarding whether Giardia intestinalis is a single species prevalent in both human and animal hosts or a species complex consisting of morphologically similar organisms that differ in host range and other biotypic characteristics is an issue with important medical, veterinary, and environmental management implications. In the past decade, highly distinct genotypes (some apparently confined to particular host groups) have been identified by genetic analysis of samples isolated from different host species. The aim of this study was to undertake a phylogenetic analysis of G. intestinalis that were representative of all known major genetic groups and compare them with other Giardia species, viz. G. ardeae, G. muris, and G. microti. Segments from four "housekeeping" genes (specifying glutamate dehydrogenase, triose phosphate isomerase, elongation factor 1 alpha, and 18S ribosomal RNA) were examined by analysis of 0.48-0.69-kb nucleotide sequences determined from DNA amplified in polymerase chain reactions from each locus. In addition, isolates were compared by allozymic analysis of electrophoretic data obtained for 21 enzymes representing 23 gene loci. The results obtained from these independent techniques and different loci were essentially congruous. Analyses using G. ardeae and/or G. muris as outgroups supported the monophyly of G. intestinalis and also showed that this species includes genotypes that represent at least seven deeply rooted lineages, herein designated assemblages A-G. Inclusion of G. microti in the analysis of 18S rRNA sequence data demonstrated the monophyly of Giardia with the same median body morphology but did not support the monophyly of G. intestinalis, instead placing G. microti within G. intestinalis. The findings support the hypothesis that G. intestinalis is a species complex and suggest that G. microti is a member of this complex.

Journal ArticleDOI
TL;DR: It is demonstrated that with codon usage analysis, the proposed horizontally transferred genes can be distinguished from highly expressed genes.
Abstract: Glycosyl hydrolase (GH) genes from Escherichia coli and Bacillus subtilis were used to search for cases of horizontal gene transfer. Such an event was inferred by G + C content, codon usage analysis, and a phylogenetic congruency test. The codon usage analysis used is a procedure based on a distance derived from a Pearson linear correlation coefficient determined from a pairwise codon usage comparison. The distances are then used to generate a distance-based tree with which we can define clusters and rapidly compare codon usage. Three genes (yagH from E. coli and xynA and xynB from B. subtilis) were determined to have arrived by horizontal gene transfer and were located in E. coli CP4-6 prophage, and B. subtilis prophages 6 and 5, respectively. In this study, we demonstrate that with codon usage analysis, the proposed horizontally transferred genes can be distinguished from highly expressed genes.
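
A pairwise codon usage comparison of the kind described in the abstract above might look like the following minimal sketch. The abstract does not say how the distance is derived from the Pearson coefficient, so the choice d = 1 - r here is an assumption, as are the function names and the use of relative codon frequencies over all 64 codons.

```python
from itertools import product
from math import sqrt

# All 64 codons in a fixed order, so that frequency vectors are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    counts = dict.fromkeys(CODONS, 0)
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # ignore codons containing ambiguous bases
            counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no complete unambiguous codon")
    return [counts[c] / total for c in CODONS]


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length vectors."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def codon_usage_distance(cds_a, cds_b):
    """Assumed distance: 1 - r, so identical usage gives a distance of 0."""
    return 1.0 - pearson(codon_frequencies(cds_a), codon_frequencies(cds_b))
```

A matrix of such pairwise distances can then be passed to any distance-based tree method to define codon usage clusters, as the abstract describes.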

Journal ArticleDOI
TL;DR: A eukaryotic rooting provides a simple explanation for the high similarity of Archaea and Bacteria observed in complete-genome analysis, and should prompt a reconsideration of current views on the origin of eukaryotes.
Abstract: The 54-kDa signal recognition particle and the receptor SR alpha, two proteins involved in the cotranslational translocation of proteins, are paralogs. They originate from a gene duplication that occurred prior to the last universal common ancestor, allowing one to root the universal tree of life. Phylogenetic analysis using standard methods supports the generally accepted cluster of Archaea and Eucarya. However, a new method increasing the signal-to-noise ratio strongly suggests that this result is due to a long-branch attraction artifact, with the Bacteria evolving fastest. In fact, the Archaea/Eucarya sisterhood is recovered only by the fast-evolving positions. In contrast, the most slowly evolving positions, which are the most likely to retain the ancient phylogenetic signal, support the monophyly of prokaryotes. Such a eukaryotic rooting provides a simple explanation for the high similarity of Archaea and Bacteria observed in complete-genome analysis, and should prompt a reconsideration of current views on the origin of eukaryotes.

Journal ArticleDOI
TL;DR: In humans, mitochondrial variation is characterized by an excess of rare frequency mutations and a negative D value, which has been interpreted as the result of a recent expansion in population size, and most nuclear loci are characterized by the opposite pattern.
Abstract: Whether or not humans have experienced a reduction in population size in the recent past is a controversial issue germane to the origin and colonization of our own species (Stringer and Andrews 1988). A change in population size can result in deviations from the neutral patterns of nucleotide variation expected at equilibrium. Using the frequency distribution of mutations segregating in extant populations, the magnitude of a deviation can be measured by Tajima’s (1989a) D statistic or by a number of alternative measures (Fu and Li 1993; Fu 1996). In a population of constant size, variation at a neutrally evolving locus is expected to have a D value of approximately zero. Following a reduction in population size, rare frequency mutations are lost more readily than are common mutations (Nei, Maruyama, and Chakraborty 1975), and transient positive D values are expected (Tajima 1989b). Following an increase in population size, there is a temporary excess of new mutations segregating at rare frequencies, and negative D values are expected. The sign of Tajima’s D subsequent to a population bottleneck can be positive, negative, or zero depending on the length of time since the bottleneck and the severity of the bottleneck. If a bottleneck is so severe that all variation is eliminated or lasts so long that the population reaches a new equilibrium, Tajima’s D follows the pattern produced by an expansion in population size. However, following an incomplete bottleneck, Tajima’s D is transiently positive before becoming negative and eventually approaching its equilibrium (Tajima 1989b). In humans, mitochondrial variation is characterized by an excess of rare frequency mutations and a negative D value, which has been interpreted as the result of a recent expansion in population size (Merriwether et al. 1991; Rogers and Harpending 1992). 
In contrast, most nuclear loci are characterized by the opposite pattern, an excess of common mutations and positive D values, which can result from a recent reduction in population size (Hey 1997; Harding et al. 1997; Clark et al. 1998; Zietkiewicz et al. 1998). The conflicting profiles of mitochondrial and nuclear variation have led to the suggestion that these patterns cannot be simultaneously accounted for by human population history, which must be shared by both genomes (Hey 1997).
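
The sign behavior of Tajima's D discussed in the abstract above can be checked with a small calculation. The sketch below follows the standard formula from Tajima (1989) for aligned sequences without missing data; the function name and the toy alignments are hypothetical and are not data from the paper.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(sequences)
    if n < 4:
        # For n < 4 the variance term is zero and D is undefined.
        raise ValueError("need at least 4 sequences")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignment where every mutation is a singleton (excess of rare variants).
rare = ["AAAA", "TAAA", "ATAA", "AATA"]
# Toy alignment where every mutation is at intermediate frequency.
common = ["AA", "AA", "TT", "TT"]
```

With these inputs, `tajimas_d(rare)` is negative and `tajimas_d(common)` is positive, matching the expansion-like and reduction-like patterns contrasted in the abstract.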

Journal ArticleDOI
TL;DR: Gene fragments from 12 house-keeping loci distributed around the meningococcal chromosome were analyzed, showing that identical alleles are disseminated among genetically diverse isolates, with no evidence for linkage disequilibrium, and that a bifurcating treelike phylogeny is not an appropriate model for anything other than the short-term evolution of this species.
Abstract: The extent to which recombination disrupts the bifurcating treelike phylogeny and clonal structure imposed by binary fission on bacterial populations remains contentious. Here, we address this question with a study of nucleotide sequence data from 107 isolates of the human pathogen Neisseria meningitidis. Gene fragments from 12 house-keeping loci distributed around the meningococcal chromosome were analyzed, showing that (1) identical alleles are disseminated among genetically diverse isolates, with no evidence for linkage disequilibrium; (2) different loci give distinct and incongruent phylogenetic trees; and (3) allele sequences are incompatible with a bifurcating treelike phylogeny at all loci. These observations are consistent with the hypothesis that meningococcal populations comprise organisms assembled from a common gene pool, with alleles and allele fragments spreading independently, together with the occasional importation of genetic material from other species. Further, they support the view that recombination is an important genetic mechanism in the generation of new meningococcal clones and alleles. Consequently, for anything other than the short-term evolution of this species, a bifurcating treelike phylogeny is not an appropriate model.

Journal ArticleDOI
TL;DR: It is found that in many cases, gene orders within operons could be shuffled frequently during evolution, although several operon structures, such as ribosomal protein operons, were well conserved, suggesting that shuffling of a genome structure is virtually neutral in long-term evolution.
Abstract: Gene orders have been shown to be generally unstable by comprehensive analyses in several complete genomes. In this study, we examined instability of genome structures within operons, where functionally related genes are clustered. We compared gene orders of known operons obtained from Escherichia coli and Bacillus subtilis with those of the corresponding operons in 11 complete genome sequences. We found that in many cases, gene orders within operons could be shuffled frequently during evolution, although several operon structures, such as ribosomal protein operons, were well conserved. This suggests that shuffling of a genome structure is virtually neutral in long-term evolution. Moreover, degrees of instability of the operon structures depended on the genomes examined. Variation in degrees of instability of the genome structures was likely to be related to differences in amounts of insertion sequences. Effects on transcription regulation are also discussed in association with operon destruction.

Journal ArticleDOI
TL;DR: It is likely that turtles originated from a Permian-Triassic archosauromorph ancestor with two pairs of temporal fenestrae behind the skull orbit that were subsequently lost and the traditional classification of turtles in the Anapsida may need to be reconsidered.
Abstract: Turtles have highly specialized morphological characteristics, and their phylogenetic position has been under intensive debate. Previous molecular studies have not established a consistent and statistically well supported conclusion on this issue. In order to address this, complete mitochondrial DNA sequences were determined for the green turtle and the blue-tailed mole skink. These genomes possess an organization of genes which is typical of most other vertebrates, such as placental mammals, a frog, and bony fishes, but distinct from organizations of alligators and snakes. Molecular evolutionary rates of mitochondrial protein sequences appear to vary considerably among major reptilian lineages, with relatively rapid rates for snake and crocodilian lineages but slow rates for turtle and lizard lineages. In spite of this rate heterogeneity, phylogenetic analyses using amino acid sequences of 12 mitochondrial proteins reliably established the Archosauria (birds and crocodilians) and Lepidosauria (lizards and snakes) clades postulated from previous morphological studies. The phylogenetic analyses further suggested that turtles are a sister group of the archosaurs, and this untraditional relationship was provided with strong statistical evidence by both the bootstrap and the Kishino-Hasegawa tests. This is the first statistically significant molecular phylogeny on the placement of turtles relative to the archosaurs and lepidosaurs. It is therefore likely that turtles originated from a Permian-Triassic archosauromorph ancestor with two pairs of temporal fenestrae behind the skull orbit that were subsequently lost. The traditional classification of turtles in the Anapsida may thus need to be reconsidered.

Journal ArticleDOI
TL;DR: It is reported here that upstream genes in the anthocyanin pathway have evolved substantially more slowly than downstream genes and it is suggested that this difference in evolutionary rates may be explained by upstream genes being more constrained because they participate in several different biochemical pathways.
Abstract: The anthocyanin biosynthetic pathway is responsible for the production of anthocyanin pigments in plant tissues and shares a number of enzymes with other biochemical pathways. The six core structural genes of this pathway have been cloned and characterized in two taxonomically diverse plant species (maize and snapdragon). We have recently cloned these genes for a third species, the common morning glory, Ipomoea purpurea. This additional information provides an opportunity to examine patterns of evolution among genes within a single biochemical pathway. We report here that upstream genes in the anthocyanin pathway have evolved substantially more slowly than downstream genes and suggest that this difference in evolutionary rates may be explained by upstream genes being more constrained because they participate in several different biochemical pathways. In addition, regulatory genes associated with the anthocyanin pathway tend to evolve more rapidly than the structural genes they regulate, suggesting that adaptive evolution of flower color may be mediated more by regulatory than by structural genes. Finally, for individual anthocyanin genes, we found an absence of rate heterogeneity among three major angiosperm lineages. This rate constancy contrasts with an accelerated rate of evolution of three CHS-like genes in the Ipomoea lineage, indicating that these three genes have diverged without coordinated adjustment by other pathway genes.

Journal ArticleDOI
TL;DR: Findings are consistent with the view that during the evolution of the Hymenoptera, rearrangements increased at the same time that the rate of point mutations and compositional bias also increased, and may direct investigations into mitochondrial genome plasticity in other invertebrate lineages.
Abstract: The arrangement of tRNA genes at the junction of the cytochrome oxidase II and ATPase 8 genes was examined across a broad range of Hymenoptera. Seven distinct arrangements of tRNA genes were identified among a group of wasps that have diverged over the last 180 Myr (suborder Apocrita); many of the rearrangements represent evolutionarily independent events. Approximately equal proportions of local rearrangements, inversions, and translocations were observed, in contrast to vertebrate mitochondria, in which local rearrangements predominate. Surprisingly, homoplasy was evident among certain types of rearrangement; a reversal of the plesiomorphic gene order has arisen on three separate occasions in the Insecta, while the tRNA(H) gene has been translocated to this locus on two separate occasions. Phylogenetic analysis indicates that this gene translocation is real and is not an artifactual translocation resulting from the duplication of a resident tRNA gene followed by mutation of the anticodon. The nature of the intergenic sequences surrounding this region does not indicate that it should be especially prone to rearrangement; it does not generally have the tandem or inverted repeats that might facilitate this plasticity. Intriguingly, these findings are consistent with the view that during the evolution of the Hymenoptera, rearrangements increased at the same time that the rate of point mutations and compositional bias also increased. This association may direct investigations into mitochondrial genome plasticity in other invertebrate lineages.

Journal ArticleDOI
TL;DR: Phylogenetic reconstruction of amino acid replacements indicates that replacements yielding increased A + T predominated early in the evolution of Buchnera, with the trend slowing or stopping during the last 50 Myr, suggesting that base composition in Buchnera has approached a limit enforced by selective constraint acting on protein function.
Abstract: A major limitation on ability to reconstruct bacterial evolution is the lack of dated ancestors that might be used to evaluate and calibrate molecular clocks. Vertically transmitted symbionts that have cospeciated with animal hosts offer a firm basis for calibrating sequence evolution in bacteria, since fossils of the hosts can be used to date divergence events. Sequences for a functionally diverse set of genes have been obtained for bacterial endosymbionts (Buchnera) from two pairs of aphid host species, each pair diverging 50-70 MYA. Using these dates and estimated numbers of Buchnera generations per year, we calculated rates of base substitution for neutral and selected sites of protein-coding genes and overall rates for rRNA genes. Buchnera shows homogeneity among loci with regard to synonymous rate. The Buchnera synonymous rate is about twice that for low-codon-bias genes of Escherichia coli-Salmonella typhimurium on an absolute timescale, and fourfold higher on a generational timescale. Nonsynonymous substitutions show a greater rate disparity in favor of Buchnera, a result consistent with a genomewide decrease in selection efficiency in Buchnera. Ratios of synonymous to nonsynonymous substitutions differ for the two pairs of Buchnera, indicating that selection efficiency varies among lineages. Like numerous other intracellular bacteria, such as Rickettsia and Wolbachia, Buchnera has accumulated amino acids with codons rich in A or T. Phylogenetic reconstruction of amino acid replacements indicates that replacements yielding increased A + T predominated early in the evolution of Buchnera, with the trend slowing or stopping during the last 50 Myr. This suggests that base composition in Buchnera has approached a limit enforced by selective constraint acting on protein function.
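
The calibration logic described in this abstract can be sketched as simple arithmetic: when two lineages diverged T years ago and now differ by K substitutions per site, the per-lineage rate is K / (2T), and dividing by the number of generations per year converts it to a generational timescale. The function names and all numbers in the usage example are hypothetical placeholders, not values reported by the authors.

```python
def rate_per_site_per_year(k_subs_per_site, divergence_time_years):
    """Per-lineage substitution rate from pairwise divergence K and split time T.

    The factor of 2 accounts for substitutions accumulating along both
    lineages since their common ancestor.
    """
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return k_subs_per_site / (2 * divergence_time_years)


def rate_per_site_per_generation(rate_per_year, generations_per_year):
    """Convert an absolute (per-year) rate to a per-generation rate."""
    if generations_per_year <= 0:
        raise ValueError("generations per year must be positive")
    return rate_per_year / generations_per_year


# Hypothetical inputs: K = 0.1 substitutions per site, split 50 million years ago,
# and an assumed 50 generations per year.
per_year = rate_per_site_per_year(0.1, 50e6)
per_generation = rate_per_site_per_generation(per_year, 50)
print(per_year, per_generation)
```

Because the host fossil dates in the abstract span a range (50-70 MYA), a calculation of this kind would typically be repeated at both ends of the range to bracket the rate.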

Journal ArticleDOI
TL;DR: The diversity of ITSs in M. arenaria, M. javanica, and M. incognita is suggested to be due to hybrid origins from closely related females (as inferred from mtDNA) and combinations of more diverse paternal lineages.
Abstract: Among root knot nematodes of the genus Meloidogyne, the polyploid obligate mitotic parthenogens M. arenaria, M. javanica, and M. incognita are widespread and common agricultural pests. Although these named forms are distinguishable by closely related mitochondrial DNA (mtDNA) haplotypes, detailed sequence analyses of internal transcribed spacers (ITSs) of nuclear ribosomal genes reveal extremely high diversity, even within individual nematodes. This ITS diversity is broadly structured into two very different groups that are 12%-18% divergent: one with low diversity (<1.0%) and one with high diversity (6%-7%). In both of these groups, identical sequences can be found within individual nematodes of different mtDNA haplotypes (i.e., among species). Analysis of genetic variance indicates that more than 90% of ITS diversity can be found within an individual nematode, with small but statistically significant (5%-10%; P < 0.05) variance distributed among mtDNA lineages. The evolutionarily distinct parthenogen M. hapla shows a similar pattern of ITS diversity, with two divergent groups of ITSs within each individual. In contrast, two diploid amphimictic species have only one lineage of ITSs with low diversity (<0.2%). The presence of divergent lineages of rDNA in the apomictic taxa is unlikely to be due to differences among pseudogenes. Instead, we suggest that the diversity of ITSs in M. arenaria, M. javanica, and M. incognita is due to hybrid origins from closely related females (as inferred from mtDNA) and combinations of more diverse paternal lineages.

Journal ArticleDOI
TL;DR: The results strongly suggest that there is a high spontaneous rate of deletions as well as a strong mutation bias toward AT pairs in the Rickettsia genomes.
Abstract: To study reductive evolutionary processes in bacterial genomes, we examine sequences in the Rickettsia genomes which are unconstrained by selection and evolve as pseudogenes, one of which is the metK gene, which codes for AdoMet synthetase. Here, we sequenced the metK gene and three surrounding genes in eight different species of the genus Rickettsia. The metK gene was found to contain a high incidence of deletions in six lineages, while the three genes in its surroundings were functionally conserved in all eight lineages. A more drastic example of gene degradation was identified in the metK downstream region, which contained an open reading frame in Rickettsia felis. Remnants of this open reading frame could be reconstructed in five additional species by eliminating sites of frameshift mutations and termination codons. A detailed examination of the two reconstructed genes revealed that deletions strongly predominate over insertions and that there is a strong transition bias for point mutations which is coupled to an excess of GC-to-AT substitutions. Since the molecular evolution of these inactive genes should reflect the rates and patterns of neutral mutations, our results strongly suggest that there is a high spontaneous rate of deletions as well as a strong mutation bias toward AT pairs in the Rickettsia genomes. This may explain the low genomic G + C content (29%), the small genome size (1.1 Mb), and the high noncoding content (24%), as well as the presence of several pseudogenes in the Rickettsia prowazekii genome.

Journal ArticleDOI
TL;DR: A scenario is suggested in which these antelopes, previously with wide pan-African distributions, became extinct except in a few refugia: the hartebeest, and probably also the topi, survived in refugia north of the equator, in the east and the west, respectively, as well as one in the south.
Abstract: The phylogeography of three species of African bovids, the hartebeest (Alcelaphus buselaphus), the topi (Damaliscus lunatus), and the wildebeest (Connochaetes taurinus), is inferred from sequence variation of 345 sequences at the control region (d-loop) of the mtDNA. The three species are closely related (tribe Alcelaphini) and share similar habitat requirements. Moreover, their former distribution extended over Africa, as a probable result of the expansion of open grassland on the continent during the last 2.5 Myr. A combination of population genetics (diversity and structure) and intraspecific phylogeny (tree topology and relative branch length) methods is used to substantiate scenarios of the species history. Population dynamics are inferred from the distribution of sequence pairwise differences within populations. In the three species, there is a significant structuring of the populations, as shown by analysis of molecular variance (AMOVA) pairwise and hierarchical differentiation estimations. In the wildebeest, a pattern of colonization from southern Africa toward east Africa is consistent with the asymmetric topology of the gene tree, showing a paraphyletic position of southern lineages, as well as their relatively longer branch lengths, and is supported by a progressive decline in population nucleotide diversity toward east Africa. The phylogenetic pattern found in the topi and the hartebeest differs from that of the wildebeest: lineages split into monophyletic clades, and no geographical trend is detected in population diversity. We suggest a scenario where these antelopes, previously with wide pan-African distributions, became extinct except in a few refugia. The hartebeest, and probably also the topi, survived in refugia north of the equator, in the east and the west, respectively, as well as one in the south. The southern refugium furthermore seems to have been the only place where the wildebeest has survived.
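
The abstract infers population dynamics from the distribution of pairwise sequence differences within populations and compares nucleotide diversity among populations. A minimal sketch of both quantities follows, assuming aligned, gap-free sequences of equal length; the function names and the toy sequences are illustrative assumptions and do not reproduce the authors' software or data.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Count how many sequence pairs differ at 0, 1, 2, ... sites."""
    counts = Counter()
    for x, y in combinations(seqs, 2):
        counts[sum(a != b for a, b in zip(x, y))] += 1
    return counts


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site across all pairs of sequences."""
    dist = mismatch_distribution(seqs)
    n_pairs = sum(dist.values())
    mean_diff = sum(k * c for k, c in dist.items()) / n_pairs
    return mean_diff / len(seqs[0])


# Hypothetical toy population of three aligned sequences.
population = ["ACGT", "ACGA", "TCGA"]
print(dict(mismatch_distribution(population)))
print(nucleotide_diversity(population))
```

In practice the histogram returned by `mismatch_distribution` would be inspected for shape, and `nucleotide_diversity` would be computed per population so that geographic trends can be compared.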

Journal ArticleDOI
TL;DR: It is determined that two floral homeotic genes, ASAP3/TM6 and ASAP1, are found in duplicate copies within members of the Hawaiian silversword alliance and appear to have arisen as a result of interspecific hybridization between two North American tarweed species.
Abstract: The polyploid Hawaiian silversword alliance (Asteraceae), a spectacular example of adaptive radiation in plants, was shown previously to have descended from North American tarweeds of the Madia/Raillardiopsis group, a primarily diploid assemblage. The origin of the polyploid condition in the silversword alliance was not resolved in earlier biosystematic, cytogenetic, and molecular studies, apart from the determination that polyploidy in modern species of Madia/Raillardiopsis arose independent of that of the Hawaiian group. We determined that two floral homeotic genes, ASAP3/TM6 and ASAP1, are found in duplicate copies within members of the Hawaiian silversword alliance and appear to have arisen as a result of interspecific hybridization between two North American tarweed species. Our molecular phylogenetic analyses of the ASAP3/TM6 loci suggest that the interspecific hybridization event in the ancestry of the Hawaiian silversword alliance involved members of lineages that include Raillardiopsis muirii (and perhaps Madia nutans) and Raillardiopsis scabrida. The ASAP1 analysis also indicates that the two species of Raillardiopsis are among the closest North American relatives of the Hawaiian silversword alliance. Previous biosystematic evidence demonstrates the potential for allopolyploid formation between members of the two North American tarweed lineages; a vigorous hybrid between R. muirii and R. scabrida has been produced that formed viable, mostly tetraporate (diploid) pollen, in keeping with observed meiotic failure. Various genetic consequences of allopolyploidy may help to explain the phenomenal evolutionary diversification of the silversword alliance.

Journal ArticleDOI
TL;DR: Estimating the congruence of the various glutamate receptor gene regions showed that the different functional domains, including the two ligand-binding domains and the transmembrane regions, have coevolved, suggesting that they assembled together before plants and animals diverged.
Abstract: We performed a genealogical analysis of the ionotropic glutamate receptor (iGluR) gene family, which includes the animal iGluRs and the newly isolated glutamate receptor-like genes (GLR) of plants discovered in Arabidopsis. Distance measures firmly placed the plant GLR genes within the iGluR clade as opposed to other ion channel clades and indicated that iGluRs may be a primitive signaling mechanism that predated the divergence of animals and plants. Moreover, phylogenetic analyses using both parsimony and neighbor joining indicated that the divergence of animal iGluRs and plant GLR genes predated the divergence of iGluR subtypes (NMDA vs. AMPA/KA) in animals. By estimating the congruence of the various glutamate receptor gene regions, we showed that the different functional domains, including the two ligand-binding domains and the transmembrane regions, have coevolved, suggesting that they assembled together before plants and animals diverged. Based on residue conservation and divergence as well as positions of residues with respect to functional domains of iGluR proteins, we attempted to examine structure-function relationships. This analysis defined M3 as the most highly conserved transmembrane domain and identified potential functionally important conserved residues whose function can be examined in future studies.

Journal ArticleDOI
R A Volkov, Nikolai Borisjuk, I I Panchuk, D Schweizer, Vera Hemleben
TL;DR: Repeated sequences in allopolyploid genomes are targets for molecular rearrangement, demonstrating the dynamic nature of allopolyploid genomes.
Abstract: Origin and rearrangement of ribosomal DNA repeats in natural allotetraploid Nicotiana tabacum are described. Comparative sequence analysis of the intergenic spacer (IGS) regions of Nicotiana tomentosiformis (the paternal diploid progenitor) and Nicotiana sylvestris (the maternal diploid progenitor) showed species-specific molecular features. These markers allowed us to trace the molecular evolution of parental rDNA in the allopolyploid genome of N. tabacum; at least the majority of tobacco rDNA repeats originated from N. tomentosiformis, which endured reconstruction of subrepeated regions in the IGS. We infer that after hybridization of the parental diploid species, rDNA with a longer IGS, donated by N. tomentosiformis, dominated over the rDNA with a shorter IGS from N. sylvestris; the latter was then eliminated from the allopolyploid genome. Thus, repeated sequences in allopolyploid genomes are targets for molecular rearrangement, demonstrating the dynamic nature of allopolyploid genomes.