Showing papers in "Evolutionary Bioinformatics in 2005"


Journal Article
TL;DR: Arlequin ver 3.0, as discussed by the authors, is a software package integrating several basic and advanced methods for population genetics data analysis, such as the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework.
Abstract: Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, such as the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework. Arlequin 3 introduces a completely new graphical interface written in C++, a more robust semantic analysis of input files, and two new methods: a Bayesian estimation of gametic phase from multi-locus genotypes, and an estimation of the parameters of an instantaneous spatial expansion from DNA sequence polymorphism. Arlequin can handle several data types, such as DNA sequences, microsatellite data, or standard multi-locus genotypes. A Windows version of the software is freely available at http://cmpg.unibe.ch/software/arlequin3.
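To make "standard genetic diversity indices" and "allele frequencies" concrete, here is a minimal Python sketch of two textbook quantities: sample allele frequencies and the unbiased gene diversity estimator H = n/(n-1) * (1 - sum of squared allele frequencies). It is an illustration only and not Arlequin's code; the function names `allele_frequencies` and `gene_diversity` are hypothetical.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return the sample frequency of each allele in a list of gene copies."""
    n = len(alleles)
    return {allele: count / n for allele, count in Counter(alleles).items()}


def gene_diversity(alleles):
    """Unbiased gene diversity: n/(n-1) * (1 - sum(p_i^2)).

    `alleles` is a flat list with one entry per sampled gene copy.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two gene copies")
    sum_sq = sum(p * p for p in allele_frequencies(alleles).values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    sample = ["A", "A", "B", "B"]
    print(allele_frequencies(sample))
    print(gene_diversity(sample))
```

The same two functions apply to haplotypes if each list entry is a haplotype label instead of a single-locus allele.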

14,271 citations


Journal Article
TL;DR: The most likely tree under the three models of DNA evolution is determined and compared with the one favoured by the tests for symmetry, and a method for assessing whether sequences have evolved under reversible conditions is presented.
Abstract: Biodiversity assessment demands objective measures, because ultimately conservation decisions must prioritize the use of limited resources for preserving taxa. The most general frameworks for the objective assessment of conservation worth are those that assess evolutionary distinctiveness, e.g. Genetic (Crozier 1992) and Phylogenetic Diversity (Faith 1992), and Evolutionary History (Nee & May 1997). These measures all attempt to assess the conservation worth of any scheme based on how much of the encompassing phylogeny of organisms is preserved. However, their general applicability is limited by the small proportion of taxa that have been reliably placed in a phylogeny. Given that phylogenization of many interesting or important taxa is unlikely to occur soon, we present a framework for using taxonomy as a reasonable surrogate for phylogeny. Combining this framework with exhaustive searches for combinations of sites containing maximal diversity, we provide a proof-of-concept for assessing conservation schemes for systematized but un-phylogenised taxa spread over a series of sites. This is illustrated with data from four studies, on North Queensland flightless insects (Yeates et al. 2002), ants from a Florida Transect (Lubertazzi & Tschinkel 2003), New England bog ants (Gotelli & Ellison 2002) and a simulated distribution of the known New Zealand Lepidosauria (Daugherty et al. 1994). The results support this approach, indicating that species, genus and site numbers predict evolutionary history, to a degree depending on the size of the data set.
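The idea of an exhaustive search over site combinations with taxonomy standing in for phylogeny can be sketched in a few lines of Python. The abstract does not define its diversity score, so the one used here (the number of distinct taxonomic nodes, i.e. species plus higher ranks, covered by a set of sites) is an assumption chosen for illustration, and the names `taxonomic_diversity` and `best_site_sets` are hypothetical.

```python
from itertools import combinations


def taxonomic_diversity(sites, occurrences, taxonomy):
    """Count the distinct taxonomic nodes covered by a set of sites.

    occurrences: site name -> set of species found there
    taxonomy:    species -> {rank: taxon name}, e.g. {"genus": "A"}
    The score is an assumed surrogate, not the paper's measure.
    """
    nodes = set()
    for site in sites:
        for species in occurrences[site]:
            nodes.add(("species", species))
            for rank, name in taxonomy[species].items():
                nodes.add((rank, name))
    return len(nodes)


def best_site_sets(occurrences, taxonomy, k):
    """Try every combination of k sites; return the top score and all ties."""
    best_score, best = -1, []
    for combo in combinations(sorted(occurrences), k):
        score = taxonomic_diversity(combo, occurrences, taxonomy)
        if score > best_score:
            best_score, best = score, [combo]
        elif score == best_score:
            best.append(combo)
    return best_score, best


if __name__ == "__main__":
    taxonomy = {"a1": {"genus": "A"}, "a2": {"genus": "A"}, "b1": {"genus": "B"}}
    occurrences = {"s1": {"a1", "a2"}, "s2": {"a1", "b1"}, "s3": {"b1"}}
    print(best_site_sets(occurrences, taxonomy, 2))
```

The search enumerates every combination of k sites, so its cost grows combinatorially with the number of sites; that is acceptable for a proof of concept on small data sets but not for large site networks.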

126 citations


Journal Article
TL;DR: This study considers gene location within bacteria as a function of genetic element mobility, and suggests that phage-encoded VFs could enhance phage Darwinian fitness, particularly by acting as ecosystem-modifying agents.
Abstract: This study considers gene location within bacteria as a function of genetic element mobility. Our emphasis is on prophage encoding of bacterial virulence factors (VFs). At least four mechanisms potentially contribute to phage encoding of bacterial VFs: (i) Enhanced gene mobility could result in greater VF gene representation within bacterial populations. We question, though, why certain genes but not others might benefit from this mobility. (ii) Epistatic interactions (between VF genes and phage genes that enhance VF utility to bacteria) could maintain phage genes via selection acting on individual, VF-expressing bacteria. However, is this mechanism sufficient to maintain the rest of phage genomes or, without gene co-regulation, even genetic linkage between phage and VF genes? (iii) Phage could amplify VFs during disease progression by carrying them to otherwise commensal bacteria colocated within the same environment. However, lytic phage kill bacteria, thus requiring assumptions of inclusive fitness within bacterial populations to explain retention of phage-mediated VF amplification for the sake of bacterial utility. Finally, (iv) phage-encoded VFs could enhance phage Darwinian fitness, particularly by acting as ecosystem-modifying agents. That is, VF-supplied nutrients could enhance phage growth by increasing the density or by improving the physiology of phage-susceptible bacteria. Alternatively, VF-mediated breakdown of diffusion-inhibiting spatial structure found within the multicellular bodies of host organisms could augment phage dissemination to new bacteria or to environments. Such phage-fitness-enhancing mechanisms could apply particularly given VF expression within microbiologically heterogeneous environments, i.e., ones where phage have some reasonable potential to acquire phage-susceptible bacteria.

77 citations


Journal Article
TL;DR: Recent progress in the elucidation of mechanisms of protein and proteome evolution in which phylogenetics has played a determinant role is surveyed.
Abstract: The study of evolutionary relationships among protein sequences was one of the first applications of bioinformatics. Since then, and accompanying the wealth of biological data produced by genome sequencing and other high-throughput techniques, the use of bioinformatics in general and phylogenetics in particular has been gaining ground in the study of protein and proteome evolution. Nowadays, the use of phylogenetics is instrumental not only to infer the evolutionary relationships among species and their genome sequences, but also to reconstruct ancestral states of proteins and proteomes and hence trace the paths followed by evolution. Here I survey recent progress in the elucidation of mechanisms of protein and proteome evolution in which phylogenetics has played a determinant role.

40 citations


Journal Article
TL;DR: MySSP is a new program for the simulation of DNA sequence evolution across a phylogenetic tree that is unique in its inclusion of indels, flexibility in allowing for non-stationary patterns, and output of ancestral sequences.
Abstract: MySSP is a new program for the simulation of DNA sequence evolution across a phylogenetic tree. Although many programs are available for sequence simulation, MySSP is unique in its inclusion of indels, flexibility in allowing for non-stationary patterns, and output of ancestral sequences. Some of these features can individually be found in existing programs, but not all of them have previously been available in a single package.
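As a rough picture of what simulating sequence evolution across a tree involves, the Python sketch below evolves a root sequence down a small tree and records every internal (ancestral) sequence along the way. It is not MySSP: it assumes the simple Jukes-Cantor (JC69) substitution model, handles substitutions only (no indels, no non-stationary patterns), and the tree encoding and the names `evolve` and `simulate` are hypothetical.

```python
import math
import random

BASES = "ACGT"


def evolve(seq, branch_length, rng):
    """Evolve a sequence along one branch under JC69.

    branch_length is in expected substitutions per site; a site ends up
    different from its parent with probability 3/4 * (1 - exp(-4t/3)).
    """
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate(tree, seq, rng, sequences=None):
    """Recursively simulate down a tree given as (name, length, children).

    Returns a dict holding the sequence at every node, ancestors included.
    """
    name, _, children = tree
    if sequences is None:
        sequences = {}
    sequences[name] = seq
    for child in children:
        simulate(child, evolve(seq, child[1], rng), rng, sequences)
    return sequences


if __name__ == "__main__":
    tree = ("root", 0.0, [
        ("anc", 0.1, [("A", 0.2, []), ("B", 0.2, [])]),
        ("C", 0.3, []),
    ])
    for node, s in simulate(tree, "ACGT" * 5, random.Random(1)).items():
        print(node, s)
```

Passing an explicit `random.Random` instance keeps runs reproducible, which matters when simulated data sets are used to benchmark inference methods.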

35 citations


Journal Article
TL;DR: A filtering technique that accelerates searches is developed, and algorithms are presented for rooted and unrooted trees, where the trees can be weighted or unweighted.
Abstract: As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The “closeness” is a measure of the topological relationships in P that are found to be the same or similar in a tree drawn from D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees, where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.
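A simple way to picture searching a database of phylogenies by structure is to score each tree by how many of the pattern's groupings it reproduces. The Python sketch below does this for rooted, unweighted trees written as nested tuples, using the fraction of shared clades as the score. This clade-sharing score is an assumed stand-in, not the closeness measure or filtering technique of the paper, and the names `clades`, `clade_similarity` and `search` are hypothetical.

```python
def clades(tree):
    """Return the set of clades (as frozensets of leaf names) in a rooted tree.

    A tree is a leaf name (str) or a tuple of subtrees.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def clade_similarity(pattern, tree):
    """Fraction of the pattern's clades that also occur in the tree."""
    wanted = clades(pattern)
    if not wanted:
        return 1.0
    return len(wanted & clades(tree)) / len(wanted)


def search(pattern, database, threshold):
    """Return (score, index) pairs for trees scoring at least threshold."""
    scored = [(clade_similarity(pattern, t), i) for i, t in enumerate(database)]
    return sorted([hit for hit in scored if hit[0] >= threshold], reverse=True)


if __name__ == "__main__":
    pattern = (("a", "b"), "c")
    database = [(("a", "b"), ("c", "d")), (("a", "c"), "b"), (("a", "b"), "c")]
    print(search(pattern, database, 0.75))
```

This version scans every tree in the database; a practical system would first apply a cheap filter to discard trees that cannot reach the threshold, which is the role the abstract assigns to its filtering technique.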

28 citations


Journal Article
TL;DR: The ALFRED database can serve the human anthropologic genetic community by identifying what loci are already typed on many populations thereby helping to focus efforts on a common set of markers.
Abstract: Many kinds of microevolutionary studies require data on multiple polymorphisms in multiple populations. Increasingly, and especially for human populations, multiple research groups collect relevant data and those data are dispersed widely in the literature. ALFRED has been designed to hold data from many sources and make them available over the web. Data are assembled from multiple sources, curated, and entered into the database. Multiple links to other resources are also established by the curators. A variety of search options are available, and additional geography-based interfaces are being developed. The database can serve the human anthropologic genetic community by identifying which loci are already typed on many populations, thereby helping to focus efforts on a common set of markers. The database can also serve as a model for databases handling similar DNA polymorphism data for other species.

23 citations


Journal Article
TL;DR: This work proposes a strategy to adopt false discovery rate (FDR) and estimate motif effects to evaluate combinatorial analysis of motif candidates and temporal gene expression data and identified known and potentially new ferric-uptake regulator (Fur) binding sites.
Abstract: The identification of transcription factor binding sites is essential to the understanding of the regulation of gene expression and the reconstruction of genetic regulatory networks. The in silico identification of cis-regulatory motifs is challenging due to sequence variability and lack of sufficient data to generate consensus motifs that are of quantitative or even qualitative predictive value. To determine functional motifs in gene expression, we propose a strategy to adopt false discovery rate (FDR) and estimate motif effects to evaluate combinatorial analysis of motif candidates and temporal gene expression data. The method decreases the number of predicted motifs, which can then be confirmed by genetic analysis. To assess the method we used simulated motif/expression data to evaluate parameters. We applied this approach to experimental data for a group of iron responsive genes in Salmonella typhimurium 14028S. The method identified known and potentially new ferric-uptake regulator (Fur) binding sites. In addition, we identified uncharacterized functional motif candidates that correlated with specific patterns of expression. SAS code for the simulation and analysis of gene expression data is available from the first author upon request.
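The abstract does not say which FDR procedure the authors used. As one common possibility, the Python sketch below implements the Benjamini-Hochberg step-up procedure, which takes one p-value per candidate (here, hypothetically, per motif) and returns the candidates retained at a chosen FDR level. It is unrelated to the authors' SAS code, and the function name `benjamini_hochberg` is an assumption made for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level alpha.

    Step-up rule: find the largest rank k (1-based, p-values sorted
    ascending) with p_(k) <= alpha * k / m, then reject ranks 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


if __name__ == "__main__":
    # Hypothetical p-values, one per motif candidate.
    motif_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(motif_p, alpha=0.05))
```

Because the rule is step-up, a candidate whose own p-value misses its rank threshold is still retained when a higher-ranked p-value passes, which is why the loop keeps the largest qualifying rank instead of stopping at the first failure.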

2 citations